Lookup vs. Scan - Which, What, When? What happens?

tdiaz · November 9, 2019, 4:49am

Another one I’m pretty sure I have a good understanding of… but I just want to make sure because it seems like there are different results depending on when, how many, chosen from clustered, un-clustered, multiple clusters or any combination thereof.

From the Left:

It’s my understanding that if you select some stuff and hit “Lookup”, it’s checking against an already existing MBID on those/that track(s)?

Whereas a “Scan” would ignore what is there, fingerprint the file again, and load the computed results based on what your sliders are set to do (as best it can, that is).

I’m questioning it because Lookup seems to behave differently when you click on “unclustered” than when you select a cluster of items.

Load it all, run Lookup across the unclustered list, and get no matches. Then cluster and use Lookup again on whole clusters, parts of clusters, or parts of multiple clusters, and Lookup reacts differently to each of those choices.

So, is Lookup also doing fuzzy logic with the other tracks that are selected when Lookup is chosen, or are Lookup and Scan always handled against a single individual track at a time?

IvanDobsky · November 9, 2019, 11:17am

This is my vague understanding… I have never read the code.

LOOKUP is using the MP3 tag data in the files. Not the MBID.

I thought the MBID lookup is “automatic”. If they are in the files, then the files are thrown to the right immediately. Unless the OPTIONS \ GENERAL \ Ignore MBIDs when loading new files is ticked.

I assume when clustered the LOOKUP is more intelligent as it knows it has an album gathered together so it can then see if the track count matches. (that is an educated guess)

Personally I make a lot of use of SCAN. I always hit that first to get the AcoustIDs inserted in the tracks. It then causes a dance to the right… and sometimes gets duff matches. So I gather it all up again and take it back to the left. Now I hit Lookup and we all do the Time Warp Again… https://www.youtube.com/watch?v=umj0gu5nEGs

tdiaz · November 9, 2019, 11:27am

I’ve got to … keep control!

Funny, I think of that one too, when pushing all the stuff back and forth.
I find that if I click on the Cluster title/folder … and it finds nothing …
Then I open it and select all the tracks and it loads up the proper album. I don’t get it.

IvanDobsky · November 9, 2019, 1:46pm

I thought that cluster worked on the album name as found in the TAG? So if no album name, there is no cluster.

Here is an example workflow of mine when tagging some MP3 files that came from some old sources not ripped by me.

I dropped three albums onto the Left Panel.

Tracks with MBIDs immediately “Leap to the Right”.

Pressing CLUSTER (Put your hands on your hips) takes the rest and gathers them in groups based on the MP3 Album TAG. This also means a release whose Disc 2 is tagged with a different album name splits into two clusters.
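A minimal Python sketch of that grouping, assuming the cluster key really is just the album tag as I guessed above. This is not Picard’s code (the real clustering may look at more fields or match fuzzily), and `cluster_by_album` and the dict layout are made up for illustration.

```python
from collections import defaultdict


def cluster_by_album(files):
    """Group files by their album tag (hypothetical helper, not Picard's API).

    Files without an album tag are returned separately as unclustered.
    """
    clusters = defaultdict(list)
    unclustered = []
    for f in files:
        album = (f.get("album") or "").strip()
        if album:
            clusters[album].append(f)
        else:
            unclustered.append(f)
    return dict(clusters), unclustered


files = [
    {"path": "01.mp3", "album": "Album"},
    {"path": "02.mp3", "album": "Album"},
    # Disc 2 tagged with a different album name ends up in its own cluster.
    {"path": "03.mp3", "album": "Album (Deluxe) Disc 2"},
    # No album tag, so no cluster.
    {"path": "04.mp3", "album": ""},
]
clusters, unclustered = cluster_by_album(files)
print(sorted(clusters))                   # the two album names
print([f["path"] for f in unclustered])   # files left unclustered
```

With a key like this, one differently tagged disc is enough to split a release in two, which matches what I see with deluxe editions.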

Next I’ll select one cluster at a time and hit SCAN to attach AcoustIDs. (pull ya knees in tight?)

The album will now Leap to the Right, but as it will probably select the wrong version of the album, I now highlight that album on the right and use Lookup in Browser followed by the TAGGER button to select the exact release I want in my tags.

With the example I just chose for the test I got an odd result. It is an 8 track version of a release. Original had 7, a later had 8, but Picard selected one with 10 instead. Hence the need of the TAGGER button on the web page to get the correct option.

Now that the TAGGER button has selected the correct release and pushed it back to Picard I need to go find the files on the Right and push them into the correct Release. (I assume that is the pelvic thrust steps of the Time Warp)

Now I can hit SAVE and…

It’s back to the left… and select that next album. This time a Release that split into two clusters due to it being a “deluxe” re-release with tags showing a different album name for disc 2.

I hit SCAN… and ARGH!!! stuck in LIMBO. I get two matches, release splits in two… first half matches to the original single disc release, and the second disc is stuck on “[loading album information]” as it grabs 23 image files… so I don’t even get a clue as to which version it is going for. (It did match the correct Deluxe CD2)

So I hit Lookup in Browser again and make my own choices while waiting for that art to download…

Returning to Picard I drag that wrongly matched disc 1 down to join disc 2 of the Deluxe set. Check the files look right. And hit Save.

Maybe this will work smoother if I put on my drag outfit like Dr. Frank-N-Furter?

The main point being it is clearly getting the matches correct, but my bias to using the SCAN button leads to AcoustID matches of the wrong releases when deluxe editions are involved.

(And yes, I know I could skip the art on this initial run, but for that I’d need to keep flicking an option on and off)

IvanDobsky · November 9, 2019, 2:04pm

What I find strange is that some people assume miracles are possible.

I have a CD here which has US, GB, FR, EU editions. Then Deluxe releases. And re-releases. But those first eight tracks have always been the same.

When the tracks only have basic MP3 tags in them it is impossible for Picard to ever know which of the above is correct. Both Lookup and Scan can only work with the data available. And the data is not unique.

SCAN will take the AcoustIDs and find best matches.

LOOKUP will take the MP3 Tags (and file names?) and find best matches.

But it is the human who needs to pick up the CD case and read the barcodes, cat nos, country details to make the actual exact match.

outsidecontext · November 9, 2019, 2:23pm

Most assumptions from the top are not quite right, so let me explain the various actions:

1. Lookup clusters: Lookup on clusters uses existing metadata (but not existing MBIDs) of a cluster to search on MB for matching releases. Since this takes the total number of tracks into account, it often gives you the best results if you want to tag entire albums and have reasonably good existing tags. This is really the recommended action to try first in most cases.
2. Lookup single files: Similar to lookup on clusters, this uses existing metadata, but it will perform recording search requests on MB for each file individually, using existing title, artist, album, tracknumber, length and ISRC information. Use this if you are tagging individual files and don’t care about the album.
3. Scan: This uses AcoustId audio fingerprinting. It calculates the fingerprint, then submits the fingerprint to the AcoustId server to get an actual AcoustId together with a list of MB recordings that are linked to this AcoustId. It then chooses the recording from this list that best matches existing metadata in the file. The Scan operates on a per-file basis and deals with recordings. Use this if existing file metadata is bad or missing and you still want to identify the file. Especially useful if you don’t know what recording this is yourself. Also use this if Lookup did not give you good results.
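To make the difference between the two Lookup variants concrete, here is a minimal Python sketch. It is not Picard’s actual code: `FileTags`, `cluster_release_query` and `file_recording_query` are hypothetical names, and the query fields (`release`, `artist`, `tracks`, `recording`, `tnum`, `dur`, `isrc`) are assumed from the MusicBrainz search syntax rather than taken from what Picard really sends.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileTags:
    """Existing tags read from one file (hypothetical container)."""
    title: str = ""
    artist: str = ""
    album: str = ""
    tracknumber: str = ""
    length_ms: int = 0
    isrc: str = ""


def _term(field: str, value: str) -> str:
    """Format one field:"value" search term, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def cluster_release_query(album: str, artist: str, files: List[FileTags]) -> str:
    """Cluster lookup: ONE release search that includes the total track count."""
    terms = [_term("release", album), _term("artist", artist),
             f"tracks:{len(files)}"]
    return " AND ".join(terms)


def file_recording_query(tags: FileTags) -> str:
    """File lookup: one recording search per file, built from that file's tags."""
    pairs = (("recording", tags.title), ("artist", tags.artist),
             ("release", tags.album), ("tnum", tags.tracknumber),
             ("isrc", tags.isrc))
    terms = [_term(field, value) for field, value in pairs if value]
    if tags.length_ms:
        terms.append(f"dur:{tags.length_ms}")
    return " AND ".join(terms)


cluster = [
    FileTags(title="A", artist="X", album="Al", tracknumber="1"),
    FileTags(title="B", artist="X", album="Al", tracknumber="2"),
]
print(cluster_release_query("Al", "X", cluster))   # one query for the cluster
for tags in cluster:
    print(file_recording_query(tags))              # one query per file
```

The cluster query carries the track count, which is information a single file cannot provide. That is why selecting the cluster and selecting the files inside it can lead to different results.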

IvanDobsky · November 9, 2019, 3:23pm

Aha - 3. Scan - So it looks at AcoustID’s database on individual files, and then cross references to MB. So this is why my results can be so scattered.
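A rough Python sketch of that per-file selection step, purely as an illustration. The scoring is my own assumption (plain string similarity over title, artist and album), not Picard’s real weighting, and `pick_recording` and the candidate dicts are invented names standing in for the list of MB recordings that comes back for one AcoustID.

```python
from difflib import SequenceMatcher

FIELDS = ("title", "artist", "album")


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(file_tags, candidate):
    """Average similarity over the fields the file actually has tags for."""
    used = [f for f in FIELDS if file_tags.get(f)]
    if not used:
        return 0.0  # nothing to compare against, every candidate ties
    total = sum(similarity(file_tags[f], candidate.get(f, "")) for f in used)
    return total / len(used)


def pick_recording(file_tags, candidates):
    """Pick the candidate that best matches the existing tags of ONE file."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: score(file_tags, c))


candidates = [
    {"id": "r1", "title": "Song", "artist": "Band", "album": "Album"},
    {"id": "r2", "title": "Song", "artist": "Band",
     "album": "Album (Deluxe Edition)"},
]
tagged = {"title": "Song", "artist": "Band", "album": "Album (Deluxe Edition)"}
print(pick_recording(tagged, candidates)["id"])  # decided by the album tag
print(pick_recording({}, candidates)["id"])      # no tags: a tie, first one wins
```

Because each file is decided on its own, two tracks from the same album can land on different releases when their tags are thin or inconsistent.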

So how should I be inserting an AcoustID? Do the LOOKUP on the cluster, jump to the right, match everything, then drag the matched cluster back to the left to do the SCAN?

If a matching release is sitting there on the right will that SCAN then use it? Or will I now lose my previous matches?

I am guessing that the SCAN button was added to Picard later than the LOOKUP option. It feels hard to fit it into a neat workflow.

hiccup · November 9, 2019, 3:46pm

Apologies @IvanDobsky for this brief off-topic hijack.
(but you can handle chaos)

I just want to say that I appreciate how well dragging and dropping matches within the right panel works when results are indeed scattered there.
I think that didn’t work so well in the earlier days of Picard, so thanks to the team for making that quite robust.

IvanDobsky · November 9, 2019, 4:08pm

I’ve only used Picard since 2017 - it used to confuse me too much before that.

It is important to realise all the links to the databases and how they all work. This then lets the human go look at the database to spot the holes which have caused Picard’s confused mismatch.

AcoustID gets REALLY useful on bootlegs. The titles of bootlegs can be so random that a normal lookup can’t find them by name, but nails them by AcoustID instead.

Best thing Picard has done for me is also make me come to MB to add more details of releases from my own weird collection. It is like a MB recruitment tool.

culinko · November 9, 2019, 4:40pm

There is PICARD-827 that talks about improving the Scan function for clusters

aerozol · November 10, 2019, 5:01am

That’s what I do

Cluster > Lookup (get correct match) > Save > Drag back to the left > Scan > Drag back into correct match that’s still sitting on the right > Submit

outsidecontext · November 10, 2019, 10:27am

The scan (formerly called analyze) button has been there since the very early days of Picard; the first public release already included it. Back then it used MusicIP instead of AcoustId, and it dates back even further to what is now known as the MusicBrainz Classic Tagger, the predecessor of Picard.

The reason why this does not fit into your workflow is that it was not made for your workflow. This button is meant as an alternative to Lookup that performs an audio fingerprint based lookup. For this it takes unmatched files, does the audio fingerprinting and gives you the matched entries on the right. The fact that it is also generating the AcoustId tags is basically a side effect.

The way you describe it you are not using the scan button for this purpose but only to generate the AcoustIds. We have a feature request to allow more flexible AcoustId generation for already matched files:

tdiaz · November 19, 2019, 6:04am

How exactly Lookup and Scan work is still somewhat ambiguous to me.

If I select scan on a Cluster, I may get no reactions.
If I tip open the Cluster and select ALL the tracks, I may get no reactions.
If I tip open the Cluster and select SOME tracks, a SINGLE track even, I may get reactions.
…or I still may not.

… Pick some assorted files and it may react to it then.

The same goes for Scan, too.

Why does selecting the whole Cluster have different results with these?

It’s almost as if it has something to do with what may already be loaded on the Right, too. (Though that would be -really- off the wall … but… )

outsidecontext · November 19, 2019, 6:59am

For Scan, both do the same. Scan always acts on a file-by-file basis.

If scan finds something for a file it will do so no matter if you have only a single file selected or have this file selected as part of others.

Selecting the whole cluster gives different results for Lookup, not for Scan. See my explanation above: Lookup on a cluster searches for matching releases, lookup on single files searches for matching recordings. That’s actually the whole point of having clusters in the first place. If there were no difference, there would be no need to cluster.

tdiaz · November 19, 2019, 7:04am

Okay, … that’s what I’ve come to understand. But I could swear it’s not working that way… Maybe I’ve spent far too many hours clicking, picking and dropping tracks all over the desktop.