AcousticBrainz features

thwaller · April 5, 2022, 6:38am

I am playing around with Picard as it has been a long time since I looked at any of the features and options. I see that in order to use the AcousticBrainz portion, the files need to be tagged with the MB tags. Is there a way to avoid this?

Maybe store the tags for the files in memory vs the file, so things can be used and cleared once complete / Picard is closed?
Maybe store the additional file information in a separate file within the directory, similar, I guess, to how Apple uses the .DS_Store file (I think that is the name)?

outsidecontext · April 5, 2022, 7:08am

Are you talking about the AcousticBrainz feature submission? Don’t bother with this, AcousticBrainz will get closed, and the corresponding functionality has already been removed from Picard’s development branch. See also the blog post:

Also the corresponding discussion here on the forums: AcousticBrainz: Making a hard decision to end the project

thwaller · April 5, 2022, 7:12am

Yes. I see, ok. Does the future of the AcoustID play into this at all?

outsidecontext · April 5, 2022, 7:13am

AcoustID and AcousticBrainz are two totally unrelated projects. AcoustID is not even a MetaBrainz project.

thwaller · April 5, 2022, 7:16am

Yes, I am aware of that. I ask as I know the usefulness and issues with AcoustID have been mentioned before, so was just unsure if the theory for moving forward included or was completely separate from the AcoustID portion.

I am also trying to look into ways to use this data personally. So before time is spent, I want to be sure that there are no plans or thoughts of discontinuing it after hearing this news.

outsidecontext · April 5, 2022, 7:25am

No, there are absolutely no plans of abandoning AcoustID.

AcousticBrainz was a project to use machine learning to identify acoustic features, such as BPM and even genre, from the files. One goal was to use this data for music recommendation, but the data turned out to not be of good quality, and the MB team did not see a way to move forward with this approach.

AcoustID on the other hand was created as an acoustic fingerprinting solution to identify audio files and match them to the corresponding MusicBrainz metadata. It was created as an open replacement to the formerly used closed source MusicIP PUIDs. AcoustID has been around for nearly 12 years now and has been very successful and proven to work really well. It is not only used by Picard but by other software as well.

thwaller · April 5, 2022, 8:07am

Has there been any thought of integrating or using the functionality that software like SongRec provides?

outsidecontext · April 5, 2022, 8:16am

I don’t know SongRec. What exactly does it do? What database does it use?

outsidecontext · April 5, 2022, 9:54am

I found it and investigated a bit: The app is SongRec (GitHub - marin-m/SongRec: An open-source Shazam client for Linux, written in Rust), which is an open source client implementation for the proprietary Shazam service.

It seems to have a free re-implementation of the fingerprinting algorithm. For lookup it uses an undocumented Shazam API, pretending to be an Android client.

I guess this could be integrated into Picard to do some basic metadata lookup. But Shazam has a different focus than e.g. AcoustID, and there are a couple of downsides:

The returned metadata is very limited (album, title, artist name, (original?) release date)
Shazam tries to be the machine equivalent of someone asking their buddy “what song is this playing right now”, and the other answering “That’s Space Oddity by David Bowie”. As such Shazam does not care about the exact recording. E.g. this 2001 original album recording, this 2018 remix and this 2008 live performance are all the same for Shazam and all return the same metadata and Shazam identifier.
The data is not linked to MusicBrainz, so an additional (ambiguous) search by album, title, artist is necessary
As said above, the service is proprietary, with no officially documented API. Access seems only to work when pretending to be an Android client, which SongRec does by randomly choosing a user agent from this list.
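
To illustrate the extra search step mentioned in the list above: a minimal sketch, assuming the MusicBrainz web service recording search (`/ws/2/recording` with a Lucene-style `query` parameter). The function name and the limit of 5 are made up for illustration, and the code only builds the URL without sending a request. Since the query goes by text only, the result can still contain several recordings of the same song.

```python
from urllib.parse import urlencode

MB_RECORDING_SEARCH = "https://musicbrainz.org/ws/2/recording"


def build_recording_search_url(title, artist, album=None):
    """Build a MusicBrainz recording search URL from title/artist/album.

    Hypothetical helper for illustration; it does not send the request.
    """
    def phrase(value):
        # Escape backslashes and double quotes so the value stays one quoted phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    terms = ["recording:" + phrase(title), "artist:" + phrase(artist)]
    if album:
        terms.append("release:" + phrase(album))

    params = {"query": " AND ".join(terms), "fmt": "json", "limit": 5}
    return MB_RECORDING_SEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_recording_search_url("Space Oddity", "David Bowie"))
```

A real client would also need to send a meaningful User-Agent header and respect the MusicBrainz rate limits.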

There could be limited use for this. E.g. if you have a file with absolutely no proper tags to identify the song and AcoustID does not have a match, you could at least get a basic idea which song by whom it is for further searches.

But given the uncertainty of the API and the limited usefulness I would not put this into core Picard. But the functionality could be put into a plugin, if someone has interest in doing so.

thwaller · April 5, 2022, 11:57am

I am sorry, I was not thinking and just assumed the software was known, that is my bad.

Yes, you found the proper software. I understand that using Shazam is not the most ideal solution, as you stated. I refer simply to the functionality. There is an alternative to Musixmatch (which might provide their fingerprinting recognition portion in addition to the lyrics?) called AudD (https://audd.io/), and MB uses AcoustID, which at face value seems to be at least similar. There is an implementation using this called Mousai (GitHub - thwaller/Mousai: Identify any songs in seconds).

The above is all part of what prompted my interest in AcousticBrainz, wanting to play around with more and different ways to identify music. I found many times I could use this on my old volumes of MP3 files, which, when they first came around, were not properly structured or organized into albums. After hearing your statements, there is overlap; here are examples of two different uses:

I have an “album” of audio files. While it is not a real album, I can identify which album(s) it might most closely resemble… helping me turn the mess into a structured set, and
Take my unstructured messes and help me identify what other music could be added to correspond to the style I have in my messy collection.

An example fitting the above, which I see as common… I have a set of, say, 20 files of the most popular recordings from an artist (sort of like a custom Essentials, Greatest Hits, etc.), and I have a few other recordings of similar artists and maybe some that my main artist is a featured artist on. What this additional functionality can help with is directing me to maybe that primary artist's real greatest hits release, releases from the artists said artist was a featured artist with, and other artists of similar style and interest. This can help me turn a random set of files that might be like a favorite custom CD into a respectable playlist that I might be able to not only collect but also import to my Spotify or YouTube Music subscription.

outsidecontext · April 5, 2022, 12:24pm

Yes, I’m aware of AudD.

There is one big conceptual difference between AcoustID and Shazam or AudD: AcoustID can only identify from the whole file (or more exactly the first 120 seconds of audio + the duration), whereas Shazam or AudD can identify from sound snippets anywhere in the audio. This is why these services are suited for e.g. identifying audio that is currently played back. lukz once mentioned that limiting AcoustID’s scope to the use case of identifying whole files, and hence only needing to index the fingerprints of the start of the audio, actually made it possible for him to host this service in a sustainable way (because it drastically reduces the needed storage and processing requirements).
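
As a rough illustration of the "fingerprint + duration" lookup: a minimal sketch, assuming the public AcoustID web service endpoint `https://api.acoustid.org/v2/lookup` with its `client`, `duration`, `fingerprint` and `meta` parameters. The fingerprint would normally come from Chromaprint (e.g. the `fpcalc` tool); the function name and the values in the usage part are placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

ACOUSTID_LOOKUP_URL = "https://api.acoustid.org/v2/lookup"


def build_acoustid_lookup_url(client_key, fingerprint, duration_seconds):
    """Build an AcoustID lookup URL from a fingerprint and the file duration.

    Hypothetical helper for illustration; it does not send the request.
    """
    params = {
        "client": client_key,                      # application API key
        "duration": int(round(duration_seconds)),  # whole-file duration in seconds
        "fingerprint": fingerprint,                # Chromaprint fingerprint string
        "meta": "recordings",                      # ask for linked MusicBrainz recordings
    }
    return ACOUSTID_LOOKUP_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder values, not a real key or fingerprint.
    print(build_acoustid_lookup_url("YOUR_CLIENT_KEY", "AQAAexample", 247.6))
```

Note that the duration is part of the lookup together with the fingerprint, which fits the whole-file use case: a short snippet recorded from the middle of a song would have neither the right start nor the right duration.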

The secondary difference that comes from these different usage patterns is how the detection is tuned. Because AcoustID was created for use with MusicBrainz Picard, and the definition of an AcoustID aligns rather closely with the definition of a recording in MB, the different recordings (like the original, remix and live recording in my example above) will usually end up with different AcoustIDs, while still allowing for some differences caused by encoding quality. Shazam and AudD are more tuned towards detecting the general song, because the detection of currently played back audio is one of the key goals.