Recordings longer than 30 minutes always fail extraction when submitting to AcousticBrainz

Jeluang · March 15, 2019, 6:50am

I’ve been submitting some of my audio files to AcousticBrainz using the extractor on Windows 10. After going through some recordings, I noticed that certain recordings always return a failure, no matter how many times I re-downloaded the audio files and re-tagged them in case the files themselves were corrupted. After several rounds of trying to submit them, I realised that all the audio files that return a failure are more than 30 minutes long. Is this intentional or is it a bug?

alastairp · March 15, 2019, 9:35am

Hi!
We don’t specifically block tracks longer than 30 minutes, but the processing step requires quite a bit of memory, so longer tracks do tend to cause the extractor to crash. Sorry about that.
How much memory do you have in your machine? We made some fixes that should allow tracks up to about 2 hours, but this depends on the amount of memory that you have too. Unfortunately we don’t have any developers that frequently use Windows, but we’ll try and look into it. Can you give us a few examples of MBIDs for files that fail?

Jeluang · March 15, 2019, 1:40pm

My PC has 4GB of memory, and the audio files that I’m trying to extract are podcasts, so it’s natural that they’re long.

Here are the MBIDs for the audio files that I tried to extract.

Freso · March 16, 2019, 7:37am

Are you able to monitor your RAM/memory usage while abzsubmit is running? Are you using the 32-bit or 64-bit extractor?
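Task Manager is the simplest way to watch this. As a rough alternative, here is a minimal Python sketch that polls the Windows `tasklist` command for the extractor process and prints its current and peak memory use. It assumes the process is named `streaming_extractor_music.exe` and that `tasklist` reports memory in kilobytes in its fifth CSV column; the function names `parse_tasklist_csv` and `watch` are made up for this example.

```python
import csv
import subprocess
import time


def parse_tasklist_csv(output):
    """Return {pid: memory_kb} parsed from `tasklist /FO CSV /NH` output."""
    usage = {}
    for row in csv.reader(output.splitlines()):
        # Assumed columns: Image Name, PID, Session Name, Session#, Mem Usage
        if len(row) < 5 or not row[1].isdigit():
            continue  # skips blank lines and the "INFO: No tasks..." message
        # Keep digits only, so "1,234,567 K" and "1.234.567 K" both parse.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usage[int(row[1])] = int(digits)
    return usage


def watch(image_name="streaming_extractor_music.exe", interval=2.0):
    """Print the extractor's memory use every `interval` seconds (Windows only)."""
    peak_kb = 0
    try:
        while True:
            result = subprocess.run(
                ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
                capture_output=True,
                text=True,
            )
            usage = parse_tasklist_csv(result.stdout)
            if usage:
                current_kb = max(usage.values())
                peak_kb = max(peak_kb, current_kb)
                print(f"current {current_kb / 1024:.0f} MB, peak {peak_kb / 1024:.0f} MB")
            time.sleep(interval)
    except KeyboardInterrupt:
        print(f"peak observed: {peak_kb / 1024:.0f} MB")
```

Calling `watch()` in a separate console while a long file is being processed, then pressing Ctrl+C after the failure, would show how much memory the extractor had reached before it stopped.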

Jeluang · March 16, 2019, 12:10pm

I didn’t monitor my RAM/memory usage while the extraction was ongoing. And is there a different extractor? On the download page, there’s only a single download for Windows, and once downloaded, there is only one executable for the extractor. There’s also streaming_extractor_music.exe, but I don’t know what that does.

alastairp · March 18, 2019, 10:03am

We only distribute the submission GUI in 32-bit for Windows, which might be why Jeluang isn’t sure which extractor is being used.

If you only have 4GB of RAM, then I’m pretty sure that this is the issue. We’ll try to reproduce it with the podcasts that you’ve linked to.
In more practical terms, the AcousticBrainz extractor tool works best on short music recordings, so we don’t get a lot of value from podcasts at this point in AcousticBrainz. Don’t worry too much about not being able to submit these items.

streaming_extractor_music.exe is the actual program which performs the calculation of data from audio. The submission tool is a separate program which calls streaming_extractor_music and submits the data. You don’t need to worry about this - we include a copy in the submission tool.
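To make the split between the two programs concrete, here is a rough Python sketch of that flow. It is not the submission tool's real code. It assumes that the extractor is run as `<extractor> <input audio> <output JSON>` and that the resulting JSON is sent with an HTTP POST to a per-MBID `low-level` URL under `https://acousticbrainz.org/api/v1`; the function names are invented for illustration.

```python
import subprocess
import tempfile
import urllib.request
from pathlib import Path

# Assumed API root; adjust if the server layout differs.
API_ROOT = "https://acousticbrainz.org/api/v1"


def extractor_command(extractor, audio_path, json_path):
    """Build the extractor call (assumed shape: program, input audio, output JSON)."""
    return [str(extractor), str(audio_path), str(json_path)]


def submission_url(mbid):
    """Build the assumed low-level submission URL for one recording MBID."""
    return f"{API_ROOT}/{mbid}/low-level"


def extract_and_submit(extractor, audio_path, mbid):
    """Run the extractor on one file, then POST its JSON output."""
    with tempfile.TemporaryDirectory() as tmp:
        json_path = Path(tmp) / "features.json"
        # check=True raises CalledProcessError if the extractor exits with an
        # error, which is what a crash on a very long file would look like.
        subprocess.run(extractor_command(extractor, audio_path, json_path), check=True)
        payload = json_path.read_bytes()
    request = urllib.request.Request(
        submission_url(mbid),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In this sketch, the memory-heavy step is entirely inside the `subprocess.run` call, so a failure on a long recording would surface there, before anything is sent to the server.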

Fabe56 · July 8, 2019, 4:41pm

As I have a cloud computer with 12GB of RAM running Windows 10, I tried all the files @Jeluang wanted to extract. I was able to submit them to AB, except for the longest one, which has a track length of 49:51. I’m not sure what the maximum length is that we can analyze with 12GB, but a 2-hour recording would certainly need a lot of RAM.

https://acousticbrainz.org/5434b3b9-e0ac-47a8-a784-40ef8477d89b

https://acousticbrainz.org/53238459-1fe6-4c87-9feb-d157748b91ab

https://acousticbrainz.org/07218fb1-fb12-49af-94c0-975bcc030c6e

https://acousticbrainz.org/24a33b2a-c419-4828-8010-cf93f57f6f0a

alastairp · July 9, 2019, 10:13am

Thanks for submitting these!