On the cause of hangs/crashes

There have been a number of posts in the past about Picard hanging/crashing when attempting to process a large number of files. Suggestions to address this have included breaking up the music files into smaller batches. I have been doing a lot of testing recently of my Classical Extras plugin, which has involved a fair number of writes to the log file. My observation is that Picard will hang if there is an excessive number of writes to the log file (on the order of hundreds of thousands). I have run the same code with and without the logging against the same file sets: it hangs if the logging is turned on and does not hang if it is off. This behaviour has occurred identically with various versions of the code, log writes and file sets.
So maybe the cause of hanging with large numbers of files is nothing to do with the main Picard code, but merely a function of the number of log writes. Note that the total size of the log file seems to be irrelevant - it is the number of writes in the current session that seems to be critical.
Has anybody else noticed this? Perhaps someone could run an independent check of my hypothesis? Or better still, fix it?


I experience this issue more often than not. But it happens to me at about 2000 songs. It was the reason I enabled logging. My theory - I am just a novice - was that a process is hanging.
I see four msvcr90.dll!endthreadex+0x6f processes just spinning. So I increased the priority of these to “above normal” and set picard.exe to “highest” (NOT real-time) and in a few minutes, the app was responsive again. This is the first time I have tried this and I’m sure results will vary - if duplicated at all.
I didn’t know the app was written using python until now so there may be other places to look.
I’ll post any additional results - because it’s bugging me even more now.

-john


@wehavecookies4u, can I ask how long it took to tag those 2000 songs?

Tried that when Picard seemed totally unresponsive and after a few minutes it recovered!

Great. I have had times where that hasn’t worked (or I haven’t given it enough time).
Large multi-track MP3s (an oversight) will also cause an issue.
A stop button would be ideal with a message of what track was interrupted.
More baffling is the inconsistent behaviour with the same tracks on two different machines.
I started to set up for a trace dump, but a preview update left me without the machine for a while. The other machine (both Win 10) seems to be banging through with a few hangs. I started to break up the albums into smaller sets and it runs through. It starts getting SLOW. Clearing .\AppData\Local\MusicBrainz\Picard\cache\picard and temp seems to fix that.
Regards,
John

How do you turn the logging off?

Not sure you can turn it off, but you can make sure that the “debug mode” box is not checked (go to Help->View Error/Debug log).

3 posts were split to a new topic: Random crash reports

There were a lot of reports of slowdowns with huge collections.
There are many reasons for that, some just make sense (like many queries + rate limiting = slow or tagging huge files that need to be fully rewritten on slow network shares).
We also know UI can hang, mainly due to Python / Qt / threading issues. This will be addressed in future releases, but it implies a lot of changes in the code.
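To put a rough number on the "many queries + rate limiting = slow" point, here is a back-of-the-envelope sketch. The rate of one request per second and the 4000-release collection are assumptions chosen purely for illustration, and `estimated_lookup_seconds` is a hypothetical helper, not part of Picard.

```python
def estimated_lookup_seconds(num_requests: int, requests_per_second: float = 1.0) -> float:
    """Lower bound on wall-clock time when requests are throttled.

    The default rate of 1 request/second is an assumption for illustration.
    """
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return num_requests / requests_per_second


# Hypothetical collection: 4000 releases, one web-service request each.
seconds = estimated_lookup_seconds(4000)
print(f"At least {seconds / 60:.1f} minutes spent just waiting on the rate limit")
```

Even before any disk I/O, a throttled lookup of a few thousand releases takes on the order of an hour, which is slow but not a hang.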

That said, most slowdown reports come from Windows users, and almost none from Linux / macOS users, so I suspect something related to the Windows version only.

Picard itself logs to stderr (on Linux), but it seems to log to a file in the packaged Windows version.

Of course, debug mode is very verbose (that’s its purpose), people aren’t expected to use it all the time.
But it causes no slowdown on Linux systems. The write rate is quite low by today’s standards.

It should be noted that all read or write operations from/to audio files are made through the mutagen library.
It should also be noted that mutagen has to rewrite files in full to insert metadata in the worst case (it depends on the size of the metadata vs. the space reserved in the file).
Also some users are embedding full size images from CAA, that’s a damn bad idea (some images are > 50MB in size).
Of course, if one is reading / writing from network shares, it is even slower (especially on SMB).
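As a toy model of the "rewrite in full" point above (this is not mutagen's code, and `bytes_to_write` is a hypothetical helper with made-up sizes), the cost of a save depends on whether the new metadata still fits in the space reserved for it:

```python
def bytes_to_write(new_tag_size: int, reserved_size: int, audio_size: int) -> int:
    """Toy estimate of how many bytes a tag update has to write.

    Assumption: if the new tag fits in the reserved area, only that area is
    overwritten in place; otherwise the whole file (tag + audio) is rewritten.
    """
    if new_tag_size <= reserved_size:
        return reserved_size
    return new_tag_size + audio_size


audio = 30_000_000  # hypothetical 30 MB track
print(bytes_to_write(2_000, 4_096, audio))       # small text tags fit in place
print(bytes_to_write(50_000_000, 4_096, audio))  # a 50 MB embedded image does not
```

In this model the second case writes tens of megabytes per file, and over a slow network share that cost is paid for every track.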

So, digging into this problem is a good idea, but please indicate the exact environment, software versions, disks and/or shares, network speed, types of files, etc.

Logging network activity (HTTP queries to the web service), disk ops, etc. can also help.


FWIW, just as regards the logging issue: I was using debug mode quite extensively when developing the “Classical Extras” plugin and it was definitely causing problems (W10). I’ve now given the plugin its own custom logging feature, which is even more extensive but does not cause any hanging issues. It just writes to a file with the default buffering, which I think is a different approach to Picard. (Also separate log files are written for each release, to aid debugging, which helps to keep the file sizes down too).
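For anyone who wants to try the same idea, here is a minimal sketch of a per-release log file written with Python's default buffering. It is not the actual Classical Extras code: the `ReleaseLog` class, the file naming and the release ID are all hypothetical.

```python
import os
import tempfile
from datetime import datetime


class ReleaseLog:
    """Append-only text log for one release (hypothetical helper)."""

    def __init__(self, log_dir: str, release_id: str) -> None:
        os.makedirs(log_dir, exist_ok=True)
        self.path = os.path.join(log_dir, f"{release_id}.log")
        # No buffering argument: open() uses its default buffer, so many
        # small writes are batched before they reach the disk.
        self._file = open(self.path, "a", encoding="utf-8")

    def write(self, message: str) -> None:
        self._file.write(f"{datetime.now().isoformat()} {message}\n")

    def close(self) -> None:
        # Closing flushes whatever is still sitting in the buffer.
        self._file.close()


# Usage: one log file per release, closed when the release is done.
with tempfile.TemporaryDirectory() as log_dir:
    log = ReleaseLog(log_dir, "release-0001")  # made-up release ID
    for track_number in range(1, 4):
        log.write(f"processed track {track_number}")
    log.close()
    print(os.path.getsize(log.path), "bytes written")
```

The trade-off is that buffered lines can be lost if the program crashes before `close()`, so this suits debugging output more than crash diagnostics.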

Which version of Picard are you writing about?
FYI, the logging system has changed in the upcoming Picard 2.

Sorry, 1.4.2
Will look at moving onto 2 in a month or so.

My experiences after days of unsuccessfully trying everything to process my 40k track library… actually the only part that truly hangs indefinitely is the saving of processed files. The right approach is to add all the files at once in one go so it has the ability to do the best clustering. Turn off auto-scan and make sure album art is not ‘full-image’. Also drag the bottom ‘info’ panel to close it.

It will take a long time to import the tracks into unclustered, but trust that it will someday finish. Collapse the Unclustered Files folder to speed it up slightly. A proper progress bar would be super helpful here just to keep the user from worrying. Once imported, then cluster. Again this will take forever but it will eventually finish. Make sure your computer is set to never sleep. Again a progress bar would be really nice. Then collapse the Clusters folder and look up. Another very long test of your patience ensues, but just trust it. Now here is where the issue lies: instead of trying to select all 4000 CDs on the right to save, just select a hundred at a time and save. As soon as you click save, however, single-click on the next CD in the list to deselect the ones it is trying to save. This speeds it up tremendously. Once they are all saved, go and scan fingerprints on the ones still unclustered and on those clustered but not resolved.

I think the easiest approach for the developers to fix the hanging issue is to internally execute the save process across the entire selection list, but first chopped into 100-file chunks or even issued individually to the lower-level routines one at a time. Hanging appears to occur only when saving too many CDs at once. I don’t think fixing this in code would honestly take much time if this approach is taken. Sure, long term you want to fix the entire architecture of how it works, but short term it seems this would solve most of the issue.
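The chunking idea could look roughly like the sketch below. This is not Picard's save code: `save_in_batches` and the `save_batch` callback are hypothetical names, and the batch size of 100 is simply the figure suggested above.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def save_in_batches(
    files: Sequence[T],
    save_batch: Callable[[Sequence[T]], None],
    batch_size: int = 100,
) -> int:
    """Hand the selection to `save_batch` one chunk at a time.

    Returns the number of batches processed.
    """
    batches = 0
    for batch in chunked(files, batch_size):
        save_batch(batch)  # hypothetical stand-in for the real save routine
        batches += 1
    return batches


# Usage with a dummy save routine that only reports the batch size.
selection = [f"track_{i}.mp3" for i in range(250)]  # made-up file names
save_in_batches(selection, lambda batch: print(f"saving {len(batch)} files"))
```

Whether this would actually avoid the hang depends on where the time goes inside the real save path, which this sketch does not model.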


This is usually advised against. While part of the advice has been technical, I think the main reason nowadays is that the UI does not really help you much if you have too many files loaded and you have to check and maybe manually correct things. E.g. having files of a single album distributed across multiple releases can be difficult to see and correct if you have a few hundred other releases in between :smiley:

For that I can only recommend that new users start with smaller batches of a couple of albums and work up to an amount they can handle once they know how Picard works and how the different matching options behave. Of course this also depends on the quality of the existing collection and your personal preferences. E.g. if your files are all thrown together and not easily sortable by album you actually might want to throw in everything and use Picard’s clustering (as you said this gives better clustering results since you can only cluster what’s loaded).


You can submit a PR to the metabrainz/picard repository on GitHub.


I’m seeing the same problem with version 2.5.2.

The “unresponsive” message from Windows typically indicates the app is not servicing the Windows message queue. However, it doesn’t mean the program is hung.

A common problem when servicing an unindexed list is that performance drops off sharply as the list grows (the total work grows roughly with the square of the list size). This is a result of the underlying high-level language having to traverse from item 1 to n when referencing item n. The developer may not even be aware that this is happening.

For example, say you are processing item 100 in a list, and then ask for item 101. The underlying code will need to go to item 1, then 2, then 3 … up to 101. And each time you add another item, the cost of moving to the next item grows larger and larger.

By the time you get to a few hundred items it all grinds to a halt. It looks like the program is hung, but in reality it isn’t. But it may need days or years to complete.
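A small sketch of the effect, using a toy singly linked list. This is not Picard's or Qt's actual data structure, and `LinkedList` and `total_hops` are hypothetical names. It counts how many pointer hops are needed when every item is fetched by index, which grows with the square of the list size (steep, though not strictly exponential):

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    """Toy unindexed list: reaching item n means walking from the head."""

    def __init__(self, values):
        self.head = None
        self.hops = 0  # total pointer hops performed by get()
        for value in reversed(list(values)):
            self.head = Node(value, self.head)

    def get(self, index):
        node = self.head
        for _ in range(index):
            if node is None:
                raise IndexError(index)
            node = node.next
            self.hops += 1
        if node is None:
            raise IndexError(index)
        return node.value


def total_hops(n):
    """Fetch items 0..n-1 by index and return the total hop count."""
    items = LinkedList(range(n))
    for i in range(n):
        items.get(i)
    return items.hops


print(total_hops(100))   # 4950
print(total_hops(1000))  # 499500
```

Ten times as many items costs roughly a hundred times as many hops, which is why a loop that looks linear in the source can appear to hang on a large collection.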
