Seeking details of metadata returned by AcoustID web service

HDS01 · November 1, 2017, 9:53pm

(This was also sent to the AcoustID mailing list - apologies if you see it twice.)

I’m experimenting with using the AcoustID Web Service from within a MacOS application, which communicates with a remote database.

In summary, I want to use the AcoustID web service to find a unique identifier for a recording (or array of such identifiers, if necessary), that can be stored in a database, and later used to match against subsequent queries to identify other instances of the “same” (sonically near-identical) audio file. The specific release format and textual metadata are essentially irrelevant - we care only about matching the audio content.

We are already using Chromaprint to generate fingerprints within our application, and successfully querying the AcoustID web service to retrieve metadata for these recordings, based on those fingerprints.

I haven’t found any documentation yet that explains the details of all the metadata that is available through the AcoustID web service. I’m also a little confused about the MusicBrainz IDs (MBIDs), as opposed to AcoustID IDs. If anyone can explain, or point me to an explanation of this, that would help.

But my main question is: which items in the metadata I am retrieving are the ones that truly identify the exact, unique recording?

I see that when I query the AcoustID web service, many songs produce an array of “recordings”, “releasegroups”, etc., not just one. I know this makes sense, because songs can be released multiple times in different collections. Is there ONE single GUID that is consistent across all instances of a unique recording, that can be used as a definitive identifier for all files with matching fingerprints? Or do I need to store ALL of the “recording” identifiers for any song, then search ALL of them for a match when querying our back-end database?

Or am I missing something simpler and more obvious about using the identifiers in all this metadata?

Thanks in advance!

dns_server · November 1, 2017, 10:24pm

acoustid works by generating a hash of the recording and providing a web service to look up that information.
It is not perfect and it is possible for 2 recordings to have the same acoustid so you may want to look at other tags to pick the best one.
acoustid typically will contain a musicbrainz recording id allowing you to query that service for more information.

https://acoustid.org/webservice
The general process you will want to follow:
Generate the fingerprint with Chromaprint, as a string of numbers and letters.
Query the AcoustID web service; include meta=recordingids so you get MusicBrainz recording identifiers.
Parse the results.
Query the MusicBrainz web service for more information on each recording.
From the example in the documentation, cd2e7c47-16f5-46c6-a37c-a1eb7bf599ff was one of the recordings that matched the AcoustID.
You will then want to query the MusicBrainz web service with something like the following:
https://musicbrainz.org/ws/2/recording/cd2e7c47-16f5-46c6-a37c-a1eb7bf599ff?inc=artists+releases&fmt=json
See https://wiki.musicbrainz.org/Development/XML_Web_Service/Version_2 for the musicbrainz web service.
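
The two queries in the process above can be sketched in Python as plain URL builders. This is only an illustration: the function names are made up for this example, the client key is a placeholder you would replace with your own application key, and the lookup endpoint and parameter names (client, duration, fingerprint, meta) are the ones described on the AcoustID web service page linked above. No network request is made here.

```python
from urllib.parse import urlencode

ACOUSTID_LOOKUP = "https://api.acoustid.org/v2/lookup"
MUSICBRAINZ_RECORDING = "https://musicbrainz.org/ws/2/recording/"


def acoustid_lookup_url(client_key, fingerprint, duration, meta=("recordingids",)):
    """Build an AcoustID lookup URL (hypothetical helper, not part of any library)."""
    params = {
        "client": client_key,          # your application's API key (placeholder)
        "duration": int(duration),     # track length in whole seconds
        "fingerprint": fingerprint,    # compressed Chromaprint fingerprint string
        "meta": " ".join(meta),        # urlencode turns the spaces into "+"
    }
    return ACOUSTID_LOOKUP + "?" + urlencode(params)


def musicbrainz_recording_url(mbid, inc=("artists", "releases")):
    """Build a MusicBrainz recording lookup URL for one recording MBID."""
    query = urlencode({"inc": " ".join(inc), "fmt": "json"})
    return MUSICBRAINZ_RECORDING + mbid + "?" + query


if __name__ == "__main__":
    print(acoustid_lookup_url("YOUR_CLIENT_KEY", "AQAA", 641))
    print(musicbrainz_recording_url("cd2e7c47-16f5-46c6-a37c-a1eb7bf599ff"))
```

The second call prints the same MusicBrainz URL as the one quoted above. Fetching either URL (and respecting each service's rate limits) is left to whatever HTTP client you already use.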

Freso · November 2, 2017, 7:59am

If all you want is «instances of the “same” (sonically near-identical) audio file» you should be able to disregard MusicBrainz identifiers entirely and just use the AcoustIDs. Do note that AcoustID is not infallible and that it has been known, in the past, to group together e.g. normal and karaoke versions. Also, Chromaprint currently only looks at the first 2(?) minutes, so if two tracks have roughly the same length but diverge acoustically after the 2 minute mark, AcoustID will not pick that up. (Both of these are probably rare and insignificant enough that they’ll likely not matter for whatever your use case is, just thought I’d bring it up.)

Also, you mention “we” several times. If you’re a commercial entity wanting to use AcoustID for your product, you should subscribe at https://acoustid.biz/ to help keep the service running.

aerozol · November 2, 2017, 8:15am

In short:

This is literally a description of an AcoustID, and the AcoustID database.
We usually also attach the AcoustID to MusicBrainz IDs which store all other metadata, eg track artist, release, title & almost anything the heart desires.

HDS01 · November 2, 2017, 8:41pm

Thanks to all for your replies!

Yes, to aerozol, this is what I was hoping for. Are you saying that the AcoustID database stores ONLY fingerprints and their associated ids (AcoustIDs and MBIDs, but no metadata)? I notice that the AcoustID web service provides a considerable amount of textual metadata, even if I don’t specifically ask for MBIDs. If I ask for no metadata at all, then all I get is something that looks like this:

{
    results =     (
                {
            id = "3fb2e35c-c4d5-4360-aaff-8dad0aad05e9";
            score = "0.976727";
        },
                {
            id = "b262c597-64b3-42c4-94b0-e89b8c496d76";
            score = "0.9310389999999999";
        },
                {
            id = "49f64d53-de52-4f4b-a08d-8d78ca0001cf";
            score = "0.690862";
        }
    );
    status = ok;
}

I’m assuming that the ids above are all AcoustIDs, not MBIDs. There is more than one (in this case), but it’s easy enough to pick the one with the best score, which I assume is its result from the chromaprint match algorithm. I’m not sure why I should get more than one though - any explanation is welcome.
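
For reference, picking the best-scoring entry from a response shaped like the one above might look like this in Python. It is a minimal sketch: the function name is hypothetical, and the input is assumed to be the already-decoded JSON with the "status", "results", "id" and "score" keys seen in the output above. Scores are converted with float() because they appear as strings in that output.

```python
def best_match(response):
    """Return the result with the highest score, or None if there are no results."""
    if response.get("status") != "ok":
        raise ValueError("lookup failed: %r" % (response.get("status"),))
    results = response.get("results", [])
    if not results:
        return None
    # Scores may arrive as numbers or as strings, so normalise before comparing.
    return max(results, key=lambda r: float(r["score"]))


if __name__ == "__main__":
    # The values below are copied from the response shown above.
    sample = {
        "status": "ok",
        "results": [
            {"id": "3fb2e35c-c4d5-4360-aaff-8dad0aad05e9", "score": "0.976727"},
            {"id": "b262c597-64b3-42c4-94b0-e89b8c496d76", "score": "0.9310389999999999"},
            {"id": "49f64d53-de52-4f4b-a08d-8d78ca0001cf", "score": "0.690862"},
        ],
    }
    print(best_match(sample)["id"])
```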

If I’m correct (please confirm or correct me!), then this is the basic raw information I need. But I do want to understand what all the other metadata is, and where it comes from - it’s certainly possible that it would be useful to me a little later.

If I understand correctly:

AcoustID provides a web service that can deliver an id (based on fingerprint), and other textual metadata. It also knows the MBIDs that correspond to its own ids.
MusicBrainz provides a different web service that can deliver textual metadata, which is selected by MBIDs, from its own separate database.
If I want LOTS of metadata, and my starting point is a fingerprint, then I should query the AcoustID service to retrieve the MBID for that fingerprint, then use that MBID to retrieve further metadata from the separate MusicBrainz web service.
I can also get textual metadata from AcoustID without any query to the MusicBrainz web service. Is this also coming from MusicBrainz? If so, how current is it? If not, where does it come from?

Clarifications welcome!

The problem that you’re all helping to solve is that it’s very easy to be confused by the results of a query to the AcoustID service - it’s possible to get a LOT of metadata back, but it’s unclear where it’s coming from or what all of it means (even the MBIDs are not identified as such!). If there’s clear, detailed documentation available of the possible results from a query to AcoustID, I haven’t found it, but would love to see it.

BTW, to Freso - we’re not a commercial entity - yet. “We” is just me, most of the time. But “we” certainly hope to become a commercial entity, and would very happily pay all associated fees, etc. So your help is not only appreciated, but actually might help to get us to that point.

Again, thanks to all! I will probably have more questions, but you’ve already helped considerably, so know that it’s appreciated AND effective.

reosarevok · November 2, 2017, 9:06pm

I’m not sure about the API, but at least the AcoustID site itself does have (their own) metadata which seems to be based on the tags on the files that were scanned in order to submit the fingerprints for this AcoustID. It can be quite useful sometimes, especially in cases where there’s nothing in MusicBrainz.

lukz · November 2, 2017, 10:09pm

There is not a single “AcoustID” for a single song. That would be ideal, but it’s an unrealistic goal. When you use the AcoustID service, you get back a list of AcoustIDs matching the fingerprint. You should treat AcoustID similarly to how you treat Google results. Google doesn’t give you a single result either.

An AcoustID can be linked to MB recording IDs (MBIDs). You can get the AcoustID API to return these MBIDs, but also some metadata from MusicBrainz. AcoustID has a copy of the MusicBrainz database. For example, if you add “recordings” to the “meta” field, you will get recording names back.

AcoustID can also have some user-submitted metadata, that is generally of low quality, but as @reosarevok said, it can be useful if there is nothing in MusicBrainz. You only get this data if you add “usermeta” to the “meta” field.
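
To make the AcoustID / MBID distinction concrete, here is a small Python sketch that separates the two kinds of id in a decoded response requested with “recordings” in the “meta” field. The response layout (each result carrying an optional "recordings" list whose entries have "id" and "title") is an assumption based on the description above, the function name is hypothetical, and the sample title is a placeholder rather than real data.

```python
def linked_recordings(response):
    """Map each AcoustID to the (MBID, title) pairs of its linked MusicBrainz recordings.

    AcoustIDs with nothing linked in MusicBrainz map to an empty list.
    """
    linked = {}
    for result in response.get("results", []):
        recordings = result.get("recordings", [])
        linked[result["id"]] = [(rec["id"], rec.get("title")) for rec in recordings]
    return linked


if __name__ == "__main__":
    # Illustrative data only: the ids come from earlier in this thread,
    # the title is a placeholder.
    sample = {
        "status": "ok",
        "results": [
            {
                "id": "3fb2e35c-c4d5-4360-aaff-8dad0aad05e9",   # AcoustID
                "score": 0.97,
                "recordings": [
                    {"id": "cd2e7c47-16f5-46c6-a37c-a1eb7bf599ff",  # recording MBID
                     "title": "Placeholder Title"},
                ],
            },
            {"id": "b262c597-64b3-42c4-94b0-e89b8c496d76", "score": 0.93},
        ],
    }
    for acoustid, recs in linked_recordings(sample).items():
        print(acoustid, recs)
```

The outer "id" of each result is the AcoustID; the ids nested under "recordings" are the MusicBrainz recording MBIDs.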

HDS01 · November 2, 2017, 11:00pm

Again, thanks to you all, this is becoming clearer, and so far everything said makes sense. So Lukas, am I correct in assuming that the AcoustID with the highest “score” is the best matching fingerprint for a given recording?

Meanwhile, I will do some experimentation and see how well things work, now that I have more understanding.

I do have some other specific questions about the fingerprinting process, which are not urgent, but eventually quite important to our application:

I know that there are many cases where the difference between two “near-identical” recordings is nothing more than a different duration of (relative) silence before actual audio signal begins, at the front of a file. Since the file duration is slightly different (but otherwise contains the same audio), in theory this might produce a unique fingerprint for each of these files. But I would also guess that a clever fingerprinting algorithm might expect that, and ignore the leading silence.

Can you explain how AcoustID handles this situation? If I could have the most ideal solution, it would be to know that the two files are “identical” in terms of content, but that they each started at a specific offset from the beginning of the file, and to know those offsets. Something similar could be done for trailing silence at the end of the file, but that’s not something I care about. Is there any way to learn the “starting point” of valid audio signal (in milliseconds), relative to the beginning of the file, for each fingerprint?

Does AcoustID fingerprint the entire file, or just a portion of it (120 seconds, as stated in Freso’s answer above)? Is it possible to have it examine the entire file, and if so, how?

Again, thanks!

lukz · November 3, 2017, 6:39am

If you are looking to identify duplicate songs, you are better off not using AcoustID. Look at the problem through the Google analogy to see why AcoustID is not useful there.

The highest score gives you the best matching fingerprint at a given time. The best matching fingerprint might change when other fingerprints are submitted to the database.

AcoustID currently does not ignore leading silence. That was an intentional decision. Chromaprint implements a variant of the fingerprinting algorithm that ignores leading silence, but that should not be used with AcoustID.

The AcoustID database only stores the first 120 seconds of fingerprinted audio for the vast majority of songs. Examining the entire file would not help you if you use AcoustID.

If you are looking for duplicate detection, you will get much better results if you just use the Chromaprint fingerprints and implement the rest of the solution yourself without using AcoustID.
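
As a rough illustration of what “implement the rest yourself” could mean, the Python sketch below compares two raw Chromaprint fingerprints, taken here to be lists of 32-bit integers such as the raw output mode of fpcalc produces. It is NOT the matching code used by AcoustID or pg_acoustid: it is a plain bit-error-rate comparison, tried at a range of offsets so that a file with extra leading audio or silence can still line up. The function names, the default offset window and the minimum overlap are all arbitrary choices made for this example.

```python
def hamming_similarity(a, b):
    """Fraction of matching bits over the overlapping part of two raw fingerprints."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    # XOR leaves a 1 wherever the two 32-bit items disagree; mask handles signed input.
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (32.0 * n)


def best_alignment(a, b, max_offset=80, min_overlap=10):
    """Try shifting the fingerprints against each other; return (similarity, offset).

    A positive offset means `a` has that many extra leading items compared with `b`;
    a negative offset means `b` has the extra leading items.
    """
    best = (0.0, 0)
    for offset in range(-max_offset, max_offset + 1):
        if offset >= 0:
            sa, sb = a[offset:], b
        else:
            sa, sb = a, b[-offset:]
        if min(len(sa), len(sb)) < min_overlap:
            continue  # too little overlap to say anything meaningful
        score = hamming_similarity(sa, sb)
        if score > best[0]:
            best = (score, offset)
    return best


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    original = [rng.getrandbits(32) for _ in range(200)]
    padded = [0] * 5 + original          # same content with 5 extra leading items
    print(best_alignment(original, padded))
```

The offset is reported in fingerprint items, not milliseconds; converting it to a time requires the frame step of the Chromaprint algorithm version that produced the fingerprints. A similarity threshold for calling two files duplicates would have to be tuned on your own data.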

HDS01 · November 3, 2017, 11:20pm

Thanks Lukáš!

I think I see a way to solve my problem, assisted by your response and also by an older thread from the mailing list discussion, here:

https://groups.google.com/forum/#!topic/acoustid/C3EHIkZVpZI

It seems, if I understand correctly, that if I have the ability to create fingerprints via Chromaprint (which I have, already working in my application), then the comparison function is relatively simple. So I could create a very simple C application that takes two files and compares them, just by using some of the code extracted from Chromaprint.

This is what Christophe did in the email thread above - my version would be quite similar to his, and he posted source code for it as well, which I’ve looked at.

I notice though that you mentioned match_fingerprints2 (in the earlier thread) as the function that compares whole audio files. I only see a function called match_fingerprints3 in my current version of Chromaprint. Can you explain the difference between the two functions?

Also, am I correct in thinking that you are the creator of Chromaprint as well as AcoustID?

In any case, I plan to experiment with creating such an application, and if it works, that might be the best solution for now.

Again, thanks to all!

lukz · November 5, 2017, 10:37am

I’m sorry, I don’t remember the exact details between the various versions. You might be interested in the discussion here - https://github.com/acoustid/pg_acoustid/issues/1

You are right, I created all the AcoustID components.