Differences between fingerprinting software/packages


#1

Sorry for all the topics on acoustic ids and fingerprinting, but now that Picard has been working per its specs, I am trying to understand the whole process in some detail…

So, I understand that I can submit ids through, for example Picard using my key obtained from acoustID site, and I understand for the most part how the two interact on this data. A summary of what I would like some clarification on is:

NOTE: “this” is a reference to a fingerprint generated and submitted to MB and its association in the acoustID site data

  1. How is this related to AcousticBrainz? I might assume that AcousticBrainz takes the AcoustID and expands on this data, using the MBIDs and acoustIDs to tie all this data together?
  2. Is submitting an AcousticBrainz “submission” and submitting an acoustID the same, related, or neither?
  3. There are other “names” akin to this like Echoprint, Shazam, SoundHound, etc. Is there any relationship between these and acoustIDs, as stored in this? Is there any relation to AcousticBrainz? Yes or no, is there any sharing of data regardless of other types of relationships? Maybe it is even actually one database with different brands?
  4. So I might wrap my over-technical head around this, do you have a visual representation of acoustID matching? I would not expect this, but something like those CSI shows where they show you the progress of fingerprint matching. I am looking more just for a representation of the identification steps in the matching of IDs and the determination of the match. Will Drevo has a quite reasonable outline of this, but I am unsure if that applies to all, just his work or what.

Sorry for the questions in so much detail, but I am hoping asking some starting questions to those who know will hasten my learning curve on this.


#2

#4 is taken care of.


#3

It isn’t as far as I know - AcousticBrainz is a completely unrelated project using essentia and stuff like that to analyze other things, and does not fingerprint as such.

acoustID is a similar technology, but that is the only relationship as far as I can tell. acoustID is open source and all the others are for-profit technologies developed by their own brands (and, AFAIK, also all unrelated to each other). I guess technically MusicBrainz shares some data with Echoprint, since they’re Spotify now? Shazam used Rovi data, last I heard.


#4

Here is a perfect example of why I query such information:
https://beta.musicbrainz.org/edit/50572477

Also RE this thread: IDs that are already there

If we as editors are to utilize these tools to their fullest, we need to have a full and solid understanding of their workings.


#5

The reason you are not getting answers is that you are looking for, or expecting, a relationship where there is none.

AcoustID, Echoprint, Shazam, SoundHound are all completely different technologies. The only common thing is that they do something that could be described as “audio fingerprinting” or “audio recognition”. How they do it is different, and what exactly they can identify is different.

AcoustID and AcousticBrainz do not have anything in common, at all.

Regarding visualizing the matching process. Play with this:

https://acoustid.org/fingerprint/37536920/compare/42535969

How it’s actually done is more complicated, but the core is the same.

Here is a new version of the matching algorithm that will be eventually used in AcoustID:

This one is even more complicated.


#6

Thank you for the information. I am not really looking for any specific end, but an end no matter what it is. I am currently working on a release, now with another editor, where this information is helpful. An answer like you have provided “AcoustID and AcousticBrainz do not have anything in common, at all.” is a very useful answer. It is also clear and right to the point, perfect.

The problem I think I and some others are having with the acoustID is the id itself is sort of useless in a sense. Just like the MBID numbers. You can show me 2 of them and we can all see they are different. But when asked different how, there is no answer. They could be 2 totally different releases, or they could be 2 releases different by only one recording, or only the country of release.

Please understand I am only trying to understand the system and process better, so as to create better edits. I can say that the edits I have just made to a release are arguably the best and most accurate edits to a release I have made since I joined here. Having a bunch of excellent tools to use is a great thing, but those tools are of no use if the user is unaware or ignorant of the workings. You end up with the tools being unused or even used incorrectly causing errors.

An example that might help… if you look at an ISRC number, you can tell certain things right away like the year and country. So if I have 2 recordings of Song A and there are 2 ISRC numbers, I might in this case be able to see that the first is from 2000 and the second from 2004. So right away these numbers have a meaning and differences showing. If I do the same with acoustIDs for those same 2 recordings, the IDs themselves are nothing but a random number, like a serial number… which again is only a unique identifier. Now if I could look at those two recordings and their associated acoustIDs and notice similarity, that would be great. Example:
Recording A - acoustID 1234abcd678
Recording B - acoustID 1234abce677
I can see those IDs show a similarity vs 2 completely different series’ of characters like:
Recording A - acoustID h37djn0htea
Recording B - acoustID 7nwer9iv8hy

In the second example set, those two recordings could be the same except one starts with a guy yelling “Hi” and the other does not. I would never know this from the second set, but the first set could show me this, as the ID varies only slightly.

I hope this makes sense as to what I am trying to learn and why. If there is no pattern or anything to see or use, that is also fine. But if there is, such information can only help editors who care about their edits and are willing to put in extra time.


#7

I knew it was a mistake to introduce the concept of “AcoustIDs” from the beginning. I didn’t want to do that. I wanted AcoustID to be just a search engine, but MusicBrainz users expected the system to behave like the fingerprinting systems they used before and so we got into this mess.

AcoustIDs don’t mean anything. What defines one AcoustID is totally arbitrary. It’s based on my bad choices from the past. You can only meaningfully work with the fingerprints themselves.

If you see two AcoustIDs, pick their fingerprints with the highest counter and compare those. If you see parts that are similar, it’s probably the same song. If you see parts that are different, it’s likely an edited version. It would be great to have more tools to visualize things like this, but AcoustID is pretty much on support mode at the moment. It’s possible to develop them, but I don’t have the motivation.

What is likely going to happen in the future, at least from my side, is a completely new service with different kind of IDs. Most likely based on the same technology, but the database will be completely recreated. That is what I always wanted. Use the first iteration of AcoustID as a seed to build something better. With the existing database, it’s possible to build a better one without the effort it took to build the first one.


#8

Interesting, thank you very much for this information. To make sure I understand this… you can have two recordings that are similar or even the same, but the fingerprints are different. You cannot tell this by looking at the fingerprint string alone; you need to examine the page you provided before, where it compares them side by side. Am I understanding this correctly?

When comparing side to side, what I am seeing is like a spectrogram of the recording in a sense, that I am comparing with the other. Is that also correct? Although it might be the case, I do not mean the exact characteristics of a spectrogram, but I refer to a visualization of the recording throughout its duration.

This is all interesting to me, so sorry for over-asking a lot of details.


#9

Right, it’s not a spectrogram, it’s much higher level data, but you can treat it as a spectrogram for the purpose of comparing them.

The long axis is time, the short one represents 32 separate features. If the audio is close in the time range, those 32 features will be almost the same. On the page linked, that’s represented as black. But it’s not perfect and there is a lot of noise. That’s why you can’t compare the strings. You need to check how many of those features are the same. At some point, you set a threshold for what you consider “the same”. It’s much easier to do visually than to explain it or to write code that does it automatically. :)
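To make the "check how many of those features are the same" idea concrete, here is a minimal Python sketch. It assumes both raw fingerprints are already available as lists of 32-bit integers and are already aligned in time. The names `bit_agreement` and `looks_same` and the 0.8 threshold are purely illustrative, not anything AcoustID itself uses.

```python
def bit_agreement(fp_a, fp_b):
    """Fraction of feature bits that agree over the overlapping part."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    # XOR marks the bits that differ; mask to 32 bits so signed
    # values (e.g. from "fpcalc -signed") are handled too.
    differing = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1")
        for a, b in zip(fp_a[:n], fp_b[:n])
    )
    return 1.0 - differing / (32 * n)


def looks_same(fp_a, fp_b, threshold=0.8):
    """Apply an arbitrary, hypothetical threshold for 'the same'."""
    return bit_agreement(fp_a, fp_b) >= threshold
```

Unrelated audio tends to agree on roughly half of the bits by chance, so any useful threshold has to sit well above 0.5.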


#10

I think I got it now. I was able to find these to compare:
https://acoustid.org/fingerprint/21432938/compare/41079228
and with an offset of 19, I can see they are quite similar, as you stated, from all the black that results. I can also see that the fingerprint string is useless on its own as you said as well.
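Finding an offset like that by hand can also be approximated with a brute-force scan. This is a rough sketch, assuming the two raw fingerprints are lists of 32-bit integers. The names `agreement_at` and `best_offset`, the sign convention for the offset, and the `max_offset` and `min_overlap` defaults are all hypothetical, and this is not how AcoustID itself aligns fingerprints.

```python
def agreement_at(fp_a, fp_b, offset, min_overlap=10):
    """Bit agreement when fp_a[i + offset] is lined up with fp_b[i]."""
    if offset >= 0:
        pairs = list(zip(fp_a[offset:], fp_b))
    else:
        pairs = list(zip(fp_a, fp_b[-offset:]))
    # Very short overlaps can agree by accident, so ignore them.
    if len(pairs) < min_overlap:
        return 0.0
    differing = sum(bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in pairs)
    return 1.0 - differing / (32 * len(pairs))


def best_offset(fp_a, fp_b, max_offset=50, min_overlap=10):
    """Try every offset in [-max_offset, max_offset] and keep the best one."""
    return max(
        range(-max_offset, max_offset + 1),
        key=lambda o: agreement_at(fp_a, fp_b, o, min_overlap),
    )
```

A positive result means the second fingerprint starts that many items later inside the first one, which is what you would expect if one file has extra material at the start.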

So for the one referenced above:
https://acoustid.org/fingerprint/37536920/compare/42535969
these two do not seem as similar. What is your expert opinion on this one? I think we need to consider the age of these recordings along with the other criteria. Plus there is the fact that the originals were analog and were remastered to digital to make CDs, so we also have variation caused by mastering techniques like compression and noise reduction.


#11

As you said, I’d expect those two fingerprints to come from different masters of the same song. Even if AcoustIDs worked the way I want them to work now, those would be considered separate.

What does that mean for MusicBrainz, I don’t know.


Different ISRC or masters (not) sharing the same recordings
#12

Thanks again. One last quick question… is there a way I can do this process locally? Meaning that I take 2 recordings and compare their fingerprints? Two examples of where I would use this now are:

  1. compare a recording that appears on a compilation CD and a regular album
  2. compare my FLAC file to a lossy compression format derived from said file

EDIT: I mean I know I can manually send files through chromaprint, but that output is not so nice as the comparison chart you produce online. So I mean an easy way in that sense.


#13

Not easily. You could use fpcalc to get the raw fingerprint output (fpcalc -raw -json) and somehow inject it into the page on acoustid.org. Or just copy the code and run it in your browser locally. There are no external tools to do this.
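The fpcalc step could be scripted roughly like this. It is a sketch that assumes `fpcalc` is on the PATH and that `fpcalc -raw -json` prints a JSON object with `duration` and `fingerprint` keys, where `fingerprint` is a list of integers; check the output of your own build first. The function names are made up for illustration.

```python
import json
import subprocess


def parse_fpcalc_json(text):
    """Parse the assumed JSON output of 'fpcalc -raw -json'."""
    data = json.loads(text)
    return float(data["duration"]), [int(x) for x in data["fingerprint"]]


def raw_fingerprint(path):
    """Run fpcalc on an audio file and return (duration, raw fingerprint)."""
    result = subprocess.run(
        ["fpcalc", "-raw", "-json", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_fpcalc_json(result.stdout)
```

Calling `raw_fingerprint` on a FLAC file and on a lossy copy of it would give two integer lists that can then be compared bit by bit.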


#14

@lukz - can I dig for more info?

So, I have the output you described, and output to a text file. This leaves me with 2 questions…

  1. is there a method of getting an output in the form of the fingerprint used in MB and the one in AcoustID site? Meaning the short and the long string.
  2. is there documentation on the exact makeup of this output? My thought is to be able to put this data into something visual I can compare, or have a script compare for me. But I would need to know what this data represents and how it is representing, in general, to do so. I can easily compare character to character, but for example if a series of those characters are nothing but the date it was generated, that compare is not appropriate.

Hope that makes sense. I will spend some time reviewing the materials on the site and the man pages, just hoping you could give me a jump start on it. My reasoning is that, based on a conversation here on what makes a different recording, MB will not be able to do some of the things I want, as the guidelines allow for too much variance. So I am happy to write a script of my own for my purposes.

EDIT: I am finding some info on this, for example https://gist.github.com/lalinsky/1132166, I am just trying to put it together into a nice usable thing. I believe I am looking at your stuff there anyway, so…


#15

What do you mean by short and long strings?

If I understand correctly, by the short string you mean “AcoustID”? That’s not a fingerprint. The data model is like this:

  • One MB recording has multiple AcoustIDs (and one AcoustID can be assigned to multiple MB recordings)
  • One AcoustID has multiple fingerprints

So you can’t go directly from an AcoustID to a fingerprint.
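The data model above can be sketched as plain Python data structures. All class and field names here are hypothetical and only mirror the two bullet points; this is not the actual AcoustID schema.

```python
from dataclasses import dataclass, field


@dataclass
class Fingerprint:
    fingerprint_id: int
    raw: list  # list of 32-bit integers


@dataclass
class AcoustID:
    acoustid: str
    # One AcoustID has multiple fingerprints.
    fingerprints: list = field(default_factory=list)
    # One AcoustID can be assigned to multiple MB recordings.
    recording_mbids: set = field(default_factory=set)


def acoustids_for_recording(acoustids, mbid):
    """One MB recording can have multiple AcoustIDs."""
    return [a for a in acoustids if mbid in a.recording_mbids]
```

The AcoustID is only a container here: it holds several fingerprints, so it cannot be turned back into a single one.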

The long string (looking like this “AQAAC0kkZU…”) is a compressed version of the raw fingerprint. You can totally ignore that. It’s only useful for sending the data over the network. Useless for comparing fingerprints.

The actual fingerprint is a list of 32-bit numbers. Each number represents a short time period, each bit in the number represents one of the 32 features I mentioned before. What exactly those features represent is arbitrary, trained by a machine learning algorithm. What more documentation do you need?
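As a rough illustration of that layout, the list of 32-bit numbers can be unpacked into a time-by-feature grid. This sketch assumes a raw fingerprint given as a list of integers. The bit order (lowest bit first) and the function names are arbitrary choices made here, and the `#` and `.` characters merely imitate the black "same" areas of the comparison page.

```python
def to_bit_rows(raw_fp):
    """One row per time step, one 0/1 column per feature bit."""
    return [[(value >> bit) & 1 for bit in range(32)] for value in raw_fp]


def agreement_grid(fp_a, fp_b):
    """Text rows with '#' where a feature bit agrees and '.' where it differs."""
    lines = []
    for a, b in zip(fp_a, fp_b):
        same = ~(a ^ b)  # bits set wherever a and b agree
        lines.append(
            "".join("#" if (same >> bit) & 1 else "." for bit in range(32))
        )
    return lines
```

Printing the rows of `agreement_grid` one per line gives a crude text version of the side-by-side comparison, with time running down the page.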

There is currently no API for getting the data. The best you can do is scrape the website.


#16

Wow… ok. I think I may have been crossing terms as well as being confused in the process here. I see it now… the “short” is the acoustID and the “long” is the fingerprint. Your assumptions were correct on my improper usage of them.

For documentation, before going any further, I obviously need to redo what I thought was an understanding of the system. So I will start there and the rest just might fall into place. Let me try to explain back to you what these items are and you can correct me further if needed.

So starting with a fingerprint. This is a compressed version of the raw fingerprint, which is what I see when I use fpcalc and get a really long string - the actual data from analyzing the file. The raw data is useful, but the compressed string is not. The acoustID is a number, like an MBID, that has no direct meaning. It is a holder of the fingerprints assigned to it.

If this is correct, what determines which fingerprints are assigned to which acoustIDs? You know, I think what I would like to see if I can is an ER diagram of sorts. Let me outline further though too…

  1. So I have a MB recording, which is assigned a MBID.
  2. I have a file of a recording, which is assigned a raw fingerprint.
  3. Raw fingerprint is compressed into the fingerprint I referred to as “long”.
  4. This fingerprint is now assigned to an acoustID
  5. the MBID links to the acoustID.

Am I getting a bit closer I hope?


#17

Yes, that’s correct, but I wouldn’t say that the raw fingerprint is assigned to a file. It’s extracted from the file.

I don’t want to go into what determines which fingerprints are assigned to which AcoustIDs. We are getting into the bad choices from the past territory here. I won’t give you the details, since I don’t remember exactly how it works. The high level idea was that only “almost identical” fingerprints share the same AcoustID.


#18

Ok, So I could say this, given the logic you have outlined…

  1. an acoustID associates a fingerprint(s) that have a close similarity.
  2. an acoustID is assigned to a MBID by a user, manually, by submitting their fingerprints to a recording in MB.
  3. In a perfect theory, the acoustID and MBID should have a 1-to-1 relationship, as a change in acoustID should mean a more major change vs the fingerprints that were grouped.

I can use this by looking at the acoustIDs per recording. If there are a large number, that means there are a large number of fingerprints that were grouped with enough difference between them to cause the unmentioned logic to group them in. I can identify duplicate recordings in MB, ideally, by matching the acoustIDs. Now assuming that a fingerprint only has one acoustID ***, there is no need to look at actual fingerprints as duplicate or very similar fingerprints will all be assigned to the same acoustID.

Is that correct? As for the ***, is it also correct that the fingerprint has one and only one acoustID? Or is it possible that whatever or whoever assigns the fingerprint to an acoustID assigns more than one acoustID to a given fingerprint or a matched set of fingerprints?

EDIT: Thank you for going over all this in such detail. It is really appreciated. I hope my drilling into this is not causing you frustration or aggravation.


#19

Got side-tracked by some personal issues, sorry.

I’m not sure if your point 3 is correct. I had that aim at the beginning, but the definition of MB recording has changed (especially regarding remasters, especially ones that do not change too much) and I think it’s no longer true.

And regarding your other question, that is not necessarily true either. One fingerprint can be assigned to multiple AcoustIDs. Not the same fingerprint ID, but an identical fingerprint with a different ID. That is because AcoustID also tracks the duration of the songs submitted with the fingerprints. That is mainly for performance reasons, but it means that two identical fingerprints can end up on different AcoustIDs.


#20

Isn’t “AcousticID” an unrelated project to the MetaBrainz projects (i.e. by different people)?

I’d never heard of it until now, but now that I’m looking at the website, I’m struggling to see how there’s any connection to the MetaBrainz projects…

Or am I missing something?