"Missing MusicBrainz Data", but I know it's all in MusicBrainz

so, after taking a look at my missing data page, I noticed/remembered that practically all the tracks in my list exist in MusicBrainz, since I’ve made sure all the tracks in my library are linked to MusicBrainz releases and whatnot. I mean, there’s a couple in the missing data list that might not, but the vast majority do.

I will note that a good bit of these are in my library with US English localized names (using Picard’s “Translate artist names…” function), which might mess around with all the Japanese music I’ve got in my library… either way, the artists in question do have proper aliases.

are there plans to eventually allow users to manually connect listens to recordings?

Hi!

Can you share a few of the tracks from your page with the link of the MB recording they should have been linked to? I can take a look and see if something is broken.

That said, I already have a guess as to why those tracks are unmatched. We currently use transliteration in some steps of the matching process. It usually works well, but there are some corner cases where it fails. For instance, some characters exist in both Chinese and Japanese but are transliterated differently. To quote an example from a library we use,

The character “明” is converted “Ming” in Chinese. “明日” is converted “Ashita” but single charactor “明” will be converted “Mei” in Japanese.

While doing the matching, we do not know which language the artist/track names are in so in some cases the transliteration comes out to be wrong. So, no match.
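
To make the ambiguity concrete, here is a toy sketch. It is not the library’s code and not what ListenBrainz runs; the `TABLES` dictionary and the `transliterate` function are made up for illustration, and the only entries are the ones from the quote above.

```python
# Toy per-language tables; the entries come from the example quoted above.
# A real transliteration library uses far larger tables.
TABLES = {
    "zh": {"明": "Ming"},
    "ja": {"明日": "Ashita", "明": "Mei"},
}


def transliterate(text: str, lang: str) -> str:
    """Greedy longest-match transliteration using the toy table for `lang`."""
    table = TABLES[lang]
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # Unknown character: pass it through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# The same input gives different output depending on the assumed language.
print(transliterate("明", "zh"))
print(transliterate("明", "ja"))
print(transliterate("明日", "ja"))
```

If the matcher guesses the wrong language for a name, the transliterated string differs from the one derived from the MusicBrainz data, and a string comparison between the two fails.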

I am unsure how Picard’s translate artist name function works, but my hunch is that at some point a wrong transliteration, either in Picard or on the LB side, causes the match to fail.

We have discussed manually connecting listens to recordings earlier but there are multiple technical considerations which make it hard to implement. Therefore, the intent is to try to make other improvements to the matching process first.

Hi!
I looked into this. There are two issues here. The first recording doesn’t match because it is a standalone recording, and we currently don’t consider those during matching (if we add those in the future, this recording will match your listens).

The second one doesn’t match because the artist credit in MB is different from that on the listen. The MB one has Massaka at the start, whereas in your listens it occurs somewhere in the middle. I am not sure we can do much in this case.

Does Listenbrainz not consider the submitted MBIDs at all? I know for a fact that some listens recorded on that Missing Data page for me were submitted with MBIDs.

AFAIK with submitted MBIDs the link is established right away. That part at least already worked that way before this whole auto-matching got implemented. So might be interesting to inspect the listens in question in detail, maybe the MBIDs were not submitted or not submitted in the expected way?

You can see the listens here: Missing MusicBrainz Data of "elomatreb" - ListenBrainz - but I just tried again with some of those entries and it matched properly, so it might have just been a glitch somewhere :person_shrugging:

Hi!

ListenBrainz doesn’t consider submitted MBIDs during the matching process, so it tries to find MBID matches for every listen. But the submitted MBIDs are considered in most other parts of ListenBrainz, for instance when displaying links to the user on the website, submitting feedback, pinning recordings etc. If a listen had MBIDs submitted by the user and the matching process also assigns it a match, the ones submitted by the user are given preference.

Currently the only exception to this is statistics, which do not consider user submitted MBIDs. That’s mostly to avoid duplicates in statistics calculation. The Missing MB data page also uses the data used for statistics, which is why listens submitted with MBIDs appear on that page for you. I’ll look into fixing this.

So as @outsidecontext mentioned the submitted MBIDs are linked ASAP but we intentionally ignore those in some cases (and that resulted in buggy UX which you saw earlier).
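
Roughly, the behaviour described above can be sketched like this. It is a simplified illustration, not the actual ListenBrainz code; the field names `submitted_mbid` and `mapped_mbid`, the function names and the placeholder MBID string are all made up for the example.

```python
from typing import Optional, TypedDict


class Listen(TypedDict):
    # Hypothetical field names, chosen for this illustration only.
    track_name: str
    submitted_mbid: Optional[str]  # recording MBID sent by the client, if any
    mapped_mbid: Optional[str]     # recording MBID found by the matching process


def mbid_for_display(listen: Listen) -> Optional[str]:
    """Website links, feedback, pins: prefer the submitted MBID, else the mapped one."""
    return listen["submitted_mbid"] or listen["mapped_mbid"]


def mbid_for_stats(listen: Listen) -> Optional[str]:
    """Statistics only look at the mapped MBID, to avoid duplicates."""
    return listen["mapped_mbid"]


def appears_on_missing_data_page(listen: Listen) -> bool:
    """The Missing MB data page is built from the same data as statistics."""
    return mbid_for_stats(listen) is None


# A listen that was submitted with an MBID but got no match from the mapper.
listen: Listen = {
    "track_name": "Some Track",
    "submitted_mbid": "placeholder-submitted-mbid",
    "mapped_mbid": None,
}
print(mbid_for_display(listen))
print(appears_on_missing_data_page(listen))
```

So a listen like this one is linked fine on the website but still shows up as missing data, which is the confusing behaviour reported above.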

Hi!

Can you share some listens in that list that you know exist in MB, with a link to the relevant MB recording page? I can then look into it and try to figure out why they didn’t get matched.

Can you explain what you mean by “but I just tried again with some of those entries and it matched properly” ?

Did you submit new listens of those recordings? In that case, ListenBrainz will periodically recheck those listens for a match (we are currently experimenting with this; in case you are interested I can share more details about it). If a match is found, it’ll get applied retroactively to all listens of that recording for all users. The Missing MB data page is not updated at that time, however; it’ll take 1-7 days to refresh.

I just listened to the files in question again, and they showed up with a link to the recording in my recent listens.

In order of the page link above:

the first 4 recordings from my current list:

a lot of tracks from these releases in particular are included as well:


Translate artist name in Picard uses the MusicBrainz artist’s alias for whichever region you select as the artist/release artist name, so it should show up in a search…

for example, the Trigun soundtrack is by 今堀恒雄, and Tsuneo Imahori is the primary English alias for that artist.

edit: I’m actually pretty sure Picard only uses aliases, I don’t think there’s transliteration functionality built-in. I can test once I get back to my PC, if desired~
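
to show what I mean by “uses the alias”, here’s a rough sketch. This is not Picard’s actual code (I assume its real logic handles locale fallbacks and more); `pick_alias` and `display_name` are names I made up, and the alias fields are loosely modelled on the `name` / `locale` / `primary` fields that MusicBrainz exposes for aliases.

```python
from typing import Optional


def pick_alias(aliases: list[dict], locale: str) -> Optional[str]:
    """Return the primary alias name for `locale`, or None if there is none."""
    for alias in aliases:
        if alias.get("locale") == locale and alias.get("primary"):
            return alias["name"]
    return None


def display_name(artist: dict, locale: str) -> str:
    """Use the locale's primary alias when present, else the artist's own name."""
    return pick_alias(artist.get("aliases", []), locale) or artist["name"]


# The Trigun soundtrack example from above.
artist = {
    "name": "今堀恒雄",
    "aliases": [{"name": "Tsuneo Imahori", "locale": "en", "primary": True}],
}
print(display_name(artist, "en"))
print(display_name(artist, "fr"))
```

the point being that the name ending up in my tags is a lookup of existing MusicBrainz data, not a character-by-character transliteration.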

Fascinating, I didn’t actually know we had a good auto-linking system in place, because I only have two methods of submissions:

  • A good plugin that submits everything from the listen and links 100% of the time.
  • And the ‘Web Scrobbler’ plugin which I use to scrobble from Bandcamp and is probably missing something that stops them from being linked? Track times?

Because almost all those listens are in the DB exactly as credited and they are never linked.

I can have a dig as to why, but could we include some more info in the missing data page header/description? Or a link to somewhere with info? I’m still so short on time otherwise I’d draft something, but your post has some interesting tidbits in it already!

Yes, it would be interesting if every entry on the missing entries page and other unlinked listens in your profile contained information about all the data that was submitted to Listenbrainz, so people can troubleshoot why things weren’t linked.

Slightly related, it would also be very nice to have a feature to manually link listens to recordings in MusicBrainz, so that link can be used in the future for automatic linking.

Hi!

I took a look at these listens and recordings. They are not matching because the submitted listen’s artist name is set to the track artist from the recording’s appearance on some release. The recording artist credit and the track artist name differ a lot (for the first three, the two artists’ names are swapped between the two, and for the last two the artists are entirely different). I am not sure what’s the best way to fix this issue; I will discuss it with @rob.
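
To illustrate the swapped-names case, here is a small sketch. It is not how the ListenBrainz matcher works; `normalize`, `same_credit_string` and `same_artist_set` are hypothetical helpers and the artist names are placeholders.

```python
import re


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def same_credit_string(a: str, b: str) -> bool:
    """Compare two artist credits as whole strings, so order matters."""
    return normalize(a) == normalize(b)


def same_artist_set(a: str, b: str) -> bool:
    """Compare the individual names in two credits, ignoring their order."""
    def names(credit: str) -> set[str]:
        parts = re.split(r"\s*[&,]\s*", credit)
        return {normalize(p) for p in parts if normalize(p)}
    return names(a) == names(b)


recording_credit = "Artist A & Artist B"   # placeholder recording artist credit
listen_artist = "Artist B & Artist A"      # placeholder track artist on the listen

print(same_credit_string(recording_credit, listen_artist))
print(same_artist_set(recording_credit, listen_artist))
```

A whole-string comparison rejects the swapped credit, while an order-insensitive one accepts it. That would only help with the first three, though, not with the cases where the artists are entirely different, and a looser comparison could also let wrong matches through.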

Hi!
I looked into these as well and the reason for these recordings not matching is again that the track artist is given as the artist name in the listen and it differs significantly from the artist name on the recording.

What is the reason for not using the artist MBID to solve this problem? It seems to be ignored on purpose, but it seems like such a logical and straightforward solution. What am I missing?

Hi!

The artist mbid is not ignored entirely by ListenBrainz. The mapping is intentionally kept different from the submitted data. Most of the features in ListenBrainz use user submitted mbids if available and fall back to mapped ones.

Currently, only the statistics reports don’t consider the user submitted mbids and that is to avoid duplicates in stats. This page is built from the same data so it shows some entries that had mbids submitted by users (but those are still used in other parts of the website).

Also, some of the listens in question actually don’t have mbids submitted by the user.