Picard is extremely slow on larger amount of songs


#1

Hi.
I’m new to Picard and think it’s a great program, as I have many mp3s with kinda wrong tags or partially no tags at all, and it works pretty well recognizing them by “analyzing the fingerprint”.

But my huge problem is: my song library consists of about 120000 songs. Roughly 1/3 of them are tagged correctly and I’m trying to do the other 2/3, but when I drag more than 300 songs into the left pane with uncategorized songs, it takes a long time until they are shown, and until then Picard is just frozen.
Even worse: after they are in and I click “analyze”, Picard starts to sort them into the right pane and just freezes once the list is as big as the screen.
I thought it was just slow and let it run for another 2h, but nothing happened anymore.

For comparison: when I do exactly the same with about 200 songs at a time, it’s still pretty slow, but the work is done in less than 15 minutes.
Also, with about 100 songs Picard needs just a few seconds.

Seems like it can’t handle such amounts, which I do not think are “huge” at all.

PS: Using Picard on Windows 10 1803 x64, got 32GB Ram and 500GB SSD, so the hardware should not be too weak.

Is there any workaround, so I could do at least 1000 songs at a time? It would save me a lot of work.


#2


Have a look at the above post and then see if you can find out how to turn off logging - the following approach from that thread might be all that is actually do-able as far as “switching off logging”;

“MetaTunes
Feb 12
Not sure you can turn it off, but you can make sure that the “debug mode” box is not checked (go to Help->View Error/Debug log).”


#3

This problem also occurs in the latest development release of Picard 2.0:
https://tickets.metabrainz.org/browse/PICARD-1262

Mostly you get the answer:
Picard is made to process album by album, not collections, not hundreds or thousands of songs.


#4

I don’t think you can switch it off completely, but setting the verbosity to error (bottom left of the log window) should produce very few log entries.


#5

Have you, like renegade2k, experimented by timing how long Picard takes for 100 songs, 200 songs, and then 1000 songs?

I can see that 1000 songs might take a bit more than 10 times as long as 100 songs.
But if the time taken is blowing out to 20 or 80 times as long then there would seem to be something wrong that might be easily fixable.
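
One way to put a number on this: if N times as many songs take R times as long, the implied growth exponent is log(R) / log(N). The sketch below is only an illustration; the function name is made up, and the ratios are the hypothetical ones from this post, not measurements.

```python
import math

def scaling_exponent(size_ratio, time_ratio):
    """Return k such that the time grows roughly like size**k."""
    return math.log(time_ratio) / math.log(size_ratio)

# 10 times as many songs, taking 10, 20 or 80 times as long
for time_ratio in (10, 20, 80):
    k = scaling_exponent(10, time_ratio)
    print(f"10x the songs, {time_ratio}x the time -> exponent {k:.2f}")
```

An exponent near 1 means the time grows in step with the number of songs; an exponent approaching 2 points at work that grows with the square of the list size.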


#6

Thx for the replies, you all.
@mmirG: my “debug mode” is unchecked
@mfmeulenbelt: can you tell me exactly how to do this? When I go into “Help -> Error/Debug Log” there is only a checkbox to activate/deactivate the debug mode. This is unchecked.
Is this what you mean?

I’m surprised that the working time kinda explodes with just a few more titles than 200.
At the beginning I thought “hey, let’s put 5000 songs in at a time and let it work for the next 1-2 hours”, but checking the task manager I see no activity on my disk, so I think even if I let the PC work for the whole day, there will be no progress.

PS: Good you mentioned 2.0. Let’s go and test 2.0dev6x64 ^^
Edit: Whoops … dev6 does only start one time, as from then on it’s throwing an exception. Also see bug 1255


#7

It is really annoying to get accurate times.

If I load this URL into Picard’s search for album:


I get the 60CD Box metadata on the right side in about 10 seconds.

If I try to drag & drop only the first 10 clustered CDs (177 tracks) from the left side to the right side, I get a spinning circle (on Windows, SSD, 24GB RAM) for 87 seconds.

Drag & drop them back from the right side to the left (unclustered) side, I wait another 36 seconds (no idea why this is that time consuming).

Drag & drop the first 20 clustered CDs (326 tracks) from the left side to the right side, I get a spinning circle (on Windows) for 152 seconds.

61 seconds to put them back to the left.

Drag & drop the first 40 clustered CDs (711 tracks) from the left side to the right side, I get a spinning circle (on Windows) for 331 seconds.

So on my side, I can’t see that the working time explodes. It seems to be slow as soon as the number of tracks is higher than the usual album track count of, let’s say, 15-30.
The %-value in the matching settings seems not to have any time-influence in this process.
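
The three left-to-right timings above can be checked for linearity with a few lines of Python. This is just arithmetic on the numbers quoted in this post; the helper name is made up for the example.

```python
# (tracks, seconds) for the left-to-right drags reported in this post
measurements = [(177, 87), (326, 152), (711, 331)]

def seconds_per_track(tracks, seconds):
    """Average time spent per track for one drag & drop."""
    return seconds / tracks

for tracks, seconds in measurements:
    rate = seconds_per_track(tracks, seconds)
    print(f"{tracks:4d} tracks: {seconds:3d} s total, {rate:.2f} s per track")

# Compare how much the track count grew with how much the time grew
for (n1, t1), (n2, t2) in zip(measurements, measurements[1:]):
    print(f"{n1} -> {n2} tracks: size x{n2 / n1:.2f}, time x{t2 / t1:.2f}")
```

The cost per track stays at roughly half a second in all three runs, which is what linear (slow, but not exploding) behaviour looks like.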

I don’t know how the matching process works, maybe some kind of Levenshtein magic?
But there must be something “wrong” processing a bigger amount of files. I don’t see how comparing 711 text strings in 1142 possibilities can be that slow. Especially if you can compare additional identifying values like the track time or even a track/disc number.
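
For a sense of scale, here is a minimal sketch of a Levenshtein-based comparison. It is an assumption made for illustration only: nothing in this thread confirms that Picard matches tracks this way, and the function names are invented.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, O(len(a) * len(b))."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]

def similarity(a: str, b: str) -> float:
    """Scale the distance to 0..1, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

# Comparing every one of 711 titles with every one of 1142 candidates
files, candidates = 711, 1142
print(f"all-pairs comparisons: {files * candidates}")
print(f"example similarity: {similarity('kitten', 'sitting'):.2f}")
```

Even the brute-force all-pairs approach is under a million short string comparisons, so string matching alone would not obviously account for minutes of waiting.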


#8

It could be that this option only exists in Picard 2.0, my bad.


#9

Just ran a little test (debugging off). 2014 tracks with various tagging qualities but no MBIDs. Some are proper albums and some are sundry tracks grouped into artificial albums.
Took 5-6 minutes to load in lhs (this from a NAS - local would have been faster).
Clustering then took 6 minutes, resulting in 142 clusters and one track not clustered. No idea why this should take so long, this should just be an internal python process but I don’t know what the algorithm is.
Selected all clusters and clicked “scan” (I know “lookup” won’t find most of them because of the tag quality).
Forgot I had “Classical Extras” on, so the scan and match took 1hr 04mins (probably would have been less than half that time without the plugin, because of the “work” lookups). This resulted in 162 possible release matches and 48 clusters unmatched.
So, a bit slow but no hanging.
BTW re turning debug off - I think debug is only a problem if there are excessive messages. The advantage of setting it on and having the log screen open when running is that you can see what is happening.
All on a W10 64bit PC.


#10

An observation I have made with my usage of Picard. (Not the beta)

Something that can make a huge difference in time for even a single album is the artwork. If you have a release with lots of hi-res scans of a booklet then this can cause massive slowdowns as those files are retrieved from the artwork archive. That is hosted on a server that often gets hiccups on the bandwidth available.

It can be especially noticeable on boxsets. I find it bad enough for Pink Floyd’s 16 disc Discovery box set… so that 60 disc set you have shown above is downloading 60 images - one for each separate disk.

Maybe experiment with changing how much art is retrieved. Do you pull down all the art? Or just a single image per album?

Personally I find the way the artwork download is handled a little awkward. There is no way to stop Picard downloading images once it has started. Example: I add a new album into Picard and then Scan the files. If Picard now makes the wrong guess when matching the album it can then be a huge wait until all that artwork for the wrong album is downloaded before it will let me swap to the correct album.

I get a feeling testing is often done with more simple albums with less artwork involved.


#11

Another trick when handling loads of files like that: temporarily shut down your anti-virus. Otherwise it will be faffing around with every file you are editing, adding more pointless delays.

Obviously only shut it down if this is the only task you are doing on the PC. Common sense required :wink:


#12

Also, if you’re syncing to OneDrive or another cloud storage solution, pause syncing until after you’re done.


#13

Plugins are the problem, as they make lots more requests to the site.
Classical Extras will make more API calls to collect more information, and API calls are limited to 1 per second, so this time adds up.

What you want to do is do 2 runs through your collection.

  • First disable all plugins and try and match recordings to entries in the database. Make sure you get the right recordings and release groups.
  • If the MusicBrainz recording ID and release group ID tags are saved in the file, the next time the file is loaded in Picard it will directly look up the record.
    Enable all plugins you want at this point to add any additional information that they provide.
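
To see how the one-call-per-second limit adds up, here is a rough lower-bound calculation. The library size and the number of requests per file are assumptions chosen for illustration, not measured values, and the function names are made up.

```python
def minimum_lookup_time(requests: int, requests_per_second: float = 1.0) -> float:
    """Lower bound in seconds imposed by the rate limit alone."""
    return requests / requests_per_second

def format_duration(seconds: float) -> str:
    """Format a number of seconds as hours, minutes and seconds."""
    minutes, secs = divmod(int(round(seconds)), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"

files = 5000  # illustrative library size
for requests_per_file in (1, 2, 3):  # assumed values, not measured
    total = minimum_lookup_time(files * requests_per_file)
    print(f"{requests_per_file} request(s) per file: at least {format_duration(total)}")
```

Every extra request a plugin makes per file adds the same amount of waiting again, which is why matching first with plugins disabled keeps the first pass short.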

#14

Another datapoint: I’m processing 5000 songs, mostly singles. I didn’t do any clustering, just acoustic matching (my collection is mostly collections, so I’m sorting them straight into artist/title and ignoring albums). I let it process overnight and it sorted all but 150 into the right pane.

I hit ‘save’ to process them into their final locations. This I believe is only applying tags and renaming/moving, no cover art downloads or acoustic ID generation. It’s still taking about 15 seconds per track (over 20 hours) to move and tag the files.

This seems like it’s a programming bug of the “Big-O” variety that should be able to be easily fixed. I might go take a look at the code. (This is on v1.4.2)
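
As an illustration of what a bug of the “Big-O” variety can look like (a made-up example, not taken from Picard’s code): both hypothetical functions below drop duplicate paths, but the first rescans a list for every file, so its cost grows with the square of the file count, while the second uses a set and stays roughly linear.

```python
import time

def unique_quadratic(paths):
    """Drop duplicates by scanning a list for each item: O(n^2) overall."""
    seen = []
    for path in paths:
        if path not in seen:  # linear scan of the list on every iteration
            seen.append(path)
    return seen

def unique_linear(paths):
    """Drop duplicates with a set lookup per item: roughly O(n) overall."""
    seen = set()
    result = []
    for path in paths:
        if path not in seen:  # average constant-time hash lookup
            seen.add(path)
            result.append(path)
    return result

def time_call(func, paths):
    """Return the wall-clock seconds one call to func(paths) takes."""
    start = time.perf_counter()
    func(paths)
    return time.perf_counter() - start

for n in (500, 1000, 2000):
    paths = [f"track_{i:05d}.mp3" for i in range(n)]
    slow = time_call(unique_quadratic, paths)
    fast = time_call(unique_linear, paths)
    print(f"{n:5d} files: list scan {slow:.4f} s, set lookup {fast:.4f} s")
```

Doubling the file count should roughly quadruple the list-scan time while only doubling the set-lookup time; exact timings depend on the machine.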


#15

renegade2k,
I use the program Jaikoz from Paul Taylor. It tags up to 5000 songs in a reasonable time using MusicBrainz with AcoustID or Discogs (it makes no difference whether they are single songs or albums). If I try it with 10000 songs, the wait is much longer.


#16

Thanks for the hint.
I’ll take a look at this.
If it’s fast enough, I will be getting all my songs done within the test period ^^
But I’m definitely not willing to pay about 50€ for the same service I get from Picard for free, just because it’s faster.

Edit: I read in another forum that the free version has the limitation that you can only save 20 tags, and then it asks you to buy the license.
If this is true, it’s pretty much not what I’m looking for.


#17

Sorry, yes, that’s true. But the developer makes his living from the licenses of his (two?) products.
Jaikoz does not only match albums, it matches single songs too.


#18

I wanted to add something to this discussion which I think may save some time for peeps until this is resolved. I am working on a video compendium which will incorporate this tool in a few places. This is perhaps the best overall tool for straightening out a messy library. However, in the process of using the tool, I discovered the issue which is the subject of this thread.

I am using the latest build as of this date on w7 64bit. Sorry I can’t report the version because the tool is busy ATM, lol. But I will edit & add that later.

EDIT: version is 2.0.4

I have just processed about 5000 tunes & now it’s in the “save” process (which includes moving & renaming in my case). It’s been going almost 3 hours and it’s going alphabetically & is on letter “D” ATM. Based on this we can estimate it’s about 1/7 -> 1/6 of the way done. Assume the longer edge & we get 7*3 or 21 hours in total, so about 18 hours to go. That’s a very long time, even for this many tunes. But it would be even longer if I had bailed, thinking the tool was hung & then tried to do this a few clusters at a time (as so many here suggest). The key to avoiding bailage is to look at your network activity if you are using a NAS or cloud, and also you can look at the target folder and sort by “date modified” to see the occasional folder pop out or get updated.
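
A linear extrapolation of this kind can be sketched as follows. It assumes progress through the alphabet is proportional to elapsed time, which is only a rough guide; the function name is made up for the example.

```python
from fractions import Fraction

def estimate(elapsed_hours, fraction_done):
    """Linear extrapolation: return (total_hours, remaining_hours)."""
    total = elapsed_hours / fraction_done
    return total, total - elapsed_hours

# 3 hours elapsed, somewhere between 1/7 and 1/6 of the way through
for fraction_done in (Fraction(1, 7), Fraction(1, 6)):
    total, remaining = estimate(3, fraction_done)
    print(f"{fraction_done} done after 3 h: "
          f"total {float(total):.0f} h, remaining {float(remaining):.0f} h")
```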

EDIT: The actual total time turned out to be about 9 hours & it did complete without crashes or freezes. Also, the tool itself became partially responsive to input at some point - no longer showing the windows “busy” spinner.

The tool doesn’t presently give any estimates or progress indicators. It only shows the default Windows busy indicator while it is working. This is not good; it will cause many people to believe nothing is happening & bail if they don’t take a closer look. As a software engineer, I can probably help y’all here when I get time. There are a number of factors that could be at play, but if it’s much faster in Linux, that is a big clue.
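
Until there is a progress indicator, the “sort the target folder by date modified” check mentioned earlier in this post can be scripted. This is a minimal sketch using only the Python standard library; it does not talk to Picard at all, and the function name is invented.

```python
import os
import time

def recently_modified(target_dir, limit=5):
    """Return the `limit` most recently modified entries as (mtime, name)."""
    with os.scandir(target_dir) as entries:
        items = [(entry.stat().st_mtime, entry.name) for entry in entries]
    items.sort(reverse=True)  # newest first
    return items[:limit]

# Replace "." with the folder the files are being saved into
for mtime, name in recently_modified("."):
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
    print(stamp, name)
```

Running it every few minutes shows whether new folders are still appearing, which is a quick way to tell a slow save from a hung one.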


#19

Philip will be happy to get your help on https://github.com/metabrainz/picard/pull/975, which is typically of the “Big-O” variety.
Thanks for contributing :slight_smile:

We definitely need more coders on Picard, so you’re welcome.


#20

OK thanks for guiding me in. I will take a peek & see if I can get up to speed on his PR.