Cover Art Archive fingerprinting

henadel · May 22, 2017, 4:20pm

Hello.

Sorry, this question is about the Cover Art Archive, but I couldn’t find a dedicated discussion for it. I would like to know whether fingerprinting covers (like AcoustID does for audio) has been considered by the developers. The idea would be to look up a cover from an image query. Such a search would return all the MusicBrainz metadata for a release based on an image fingerprint.

I think that could be a cool feature, and a great way to store album covers.

Thanks,

mfmeulenbelt · May 22, 2017, 8:26pm

I think that would be pretty cool. For example, being able to take a photo of a release and look it up through the cover. It would probably be very hard to make it work reliably though.

Freso · May 22, 2017, 8:51pm

I feel like this is not necessarily something MetaBrainz needs to do. If we make sure that Google and TinEye etc. index the CAA images, then it should be possible to use those services to do just this. At least for Google and DuckDuckGo you can limit results to site:musicbrainz.org (or archive.org or coverartarchive.org, whichever the images get indexed under). TinEye probably has something similar.

I do like the idea a lot though. Seems more versatile than scanning the barcode—and hoping the barcode has been entered into the db… Which even then only works for releases that actually have barcodes.

henadel · May 23, 2017, 4:48pm

I think the cover fingerprint is quite reliable, depending on what you want to detect of course (slight rotations, distortions, colors…). It seems at least as hard (easy?) as audio fingerprinting, which musicbrainz already does.
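
To make the idea concrete, here is a minimal sketch of one common perceptual fingerprint, the difference hash (dHash). It is not the method of any tool mentioned in this thread, only an illustration. It assumes the image is already decoded into rows of grayscale values; a real implementation would read image files with an imaging library. The function names and the synthetic "cover" are made up for the example. This kind of hash tolerates brightness changes and rescaling, but it does not handle rotations.

```python
def dhash(pixels, hash_size=8):
    """Difference hash of a grayscale image given as rows of 0-255 values.

    The image is shrunk to (hash_size + 1) x hash_size samples by
    nearest-neighbour sampling; each bit records whether a sample is
    brighter than its right-hand neighbour.
    """
    height, width = len(pixels), len(pixels[0])
    cols, rows = hash_size + 1, hash_size
    bits = 0
    for r in range(rows):
        y = r * height // rows
        sampled = [pixels[y][c * width // cols] for c in range(cols)]
        for left, right in zip(sampled, sampled[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic 64x64 "cover": a repeating gradient, purely illustrative.
cover = [[(x * 7 + y * 13) % 256 for x in range(64)] for y in range(64)]
# The same cover, uniformly brightened by 10% (capped at 255).
brighter = [[min(255, int(v * 1.1)) for v in row] for row in cover]
# A different image: the gradient mirrored left to right.
other = [list(reversed(row)) for row in cover]

print(hamming(dhash(cover), dhash(brighter)))  # → 0, the ordering of neighbours is unchanged
print(hamming(dhash(cover), dhash(other)))     # many bits differ
```

Two covers are then treated as the same when the Hamming distance between their fingerprints is small; where to put that threshold depends on the distortions you want to tolerate.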

henadel · May 23, 2017, 5:01pm

When I (recently) discovered the Cover Art Archive, my first question was why this was not part of musicbrainz entirely. Google and TinEye are a great way for a human to search by images, but when you try to automatically match a given page to an album, let’s say a review, you have three main ways to do that:

1. The page references external databases such as musicbrainz. Thus the matching is already done by the writer.
2. Named Entity Recognition: with Natural Language Processing techniques, you find which album is concerned and match it later with musicbrainz.
3. Image Matching: if the page displays the album cover (which we hope), searching by the image (its fingerprint) in a database such as the Cover Art Archive will match it to all the corresponding metadata in musicbrainz.

In my experience, the third way is reliable and easy to implement (easier than 2). The prerequisite is to have the cover in the page.
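
For comparison, the first approach in the list above (the page already links to MusicBrainz) can be sketched in a few lines. This assumes the page links to release pages of the form `musicbrainz.org/release/<MBID>`, where the MBID is a UUID. The function name `release_mbids` and the sample page are hypothetical, and the MBID in the sample is made up.

```python
import re

# MusicBrainz identifiers (MBIDs) are UUIDs; this pattern looks for
# links to release pages and captures the MBID part.
RELEASE_LINK = re.compile(
    r"musicbrainz\.org/release/"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})",
    re.IGNORECASE,
)


def release_mbids(html):
    """Return the distinct release MBIDs linked from a page, in order of appearance."""
    seen = []
    for mbid in RELEASE_LINK.findall(html):
        mbid = mbid.lower()
        if mbid not in seen:
            seen.append(mbid)
    return seen


# Hypothetical review page; the MBID is invented for illustration.
page = (
    '<p>Review of <a href="https://musicbrainz.org/release/'
    '12345678-1234-1234-1234-123456789abc">the album</a></p>'
)
print(release_mbids(page))
```

When no such link exists, this returns an empty list, which is exactly the case where image matching becomes useful.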

More conceptually, one can legitimately treat the fingerprint as image metadata (as is done for audio), regardless of how it is used and without implementing a search-by-image service (third-party applications can implement that themselves and use the fingerprint for a local comparison).
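
A local comparison of this kind could look like the sketch below. It assumes fingerprints are stored as 64-bit integers next to each release and compared by Hamming distance. The release identifiers, fingerprint values, and the `max_distance` threshold are all hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def best_match(query, index, max_distance=10):
    """Return (release_id, distance) for the closest stored fingerprint,
    or None if nothing is within max_distance bits."""
    best = None
    for release_id, fingerprint in index.items():
        distance = hamming(query, fingerprint)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (release_id, distance)
    return best


# Hypothetical fingerprints, as they might be published as cover metadata.
index = {
    "release-a": 0xF0F0F0F0F0F0F0F0,
    "release-b": 0x0123456789ABCDEF,
}

query = 0xF0F0F0F0F0F0F0F1  # release-a's fingerprint with one bit flipped
print(best_match(query, index))               # → ('release-a', 1)
print(best_match(0xFFFFFFFFFFFFFFFF, index))  # → None, nothing is close enough
```

A linear scan like this is fine for a small local collection; a database the size of the Cover Art Archive would need an index built for near-duplicate search.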

I still find the idea interesting and I am eager to contribute to this if I’m not the only one who thinks it’s interesting!

Thanks for your feedback.

aerozol · May 24, 2017, 8:59am

Such a cool idea!!
You’d really have to spearhead development though, not sure anyone is going to put their hand up for a new project at the moment, there’s heaps to do on existing projects.

Freso · May 24, 2017, 6:21pm

We don’t do the audio fingerprinting ourselves, actually. AcoustID does the fingerprinting and we interface with them. Any AcoustIDs you see on MusicBrainz.org are fetched from their servers (via JavaScript, so it’s all happening in the browser too). We don’t store any audio fingerprints ourselves (anymore).

I’d imagine a similar construct would be desired for image fingerprinting/recognition.

Also, I don’t think anyone doubts the potential usefulness of this, but the paid developers have their hands full for years to come. As @aerozol says, it would need to be someone from the community who spearheads this if it’s going to happen anytime soon. This could be you. I’m sure our developers would be happy to help you as much as they can, but we just don’t have the resources to launch new projects ourselves at this time.

henadel · June 1, 2017, 9:46am

Thank you for your answer. For your information, I was thinking of a package like this one (here in Python): https://pypi.python.org/pypi/image-match. It seems to be a reliable and scalable solution. I used it at my company and was very satisfied with the results.