Search results improvements

Okay, I think I have finally managed to come to a good compromise between it all without a custom sort feature.
Please check the results now -
Riot

Fred

Bach

Pink

Tchaikovsky

Great, that looks pretty good, glad my suggestion helped you.
Is Tchaikovsky still a bit low because it is only an alias? I’m not sure why that would make such a difference. Or is there some other reason?

It’s because it’s an alias. I am working on a PR to improve boosts for primary aliases. It is a bit more work since we need to change the indexer for that.

Would it be possible to take a user’s preferred locales into account? Like the way MusicBrainz chooses which Wikipedia abstract to use on entity pages, a German transliteration would get a higher boost if the user is a German native speaker.

Probably opening a big ol’ can of worms here. :slight_smile:

The search looks amazing now! As for

I’m wondering, what about boosting only the primary Latin alias for artists whose names are written in a different script? Would that be better than boosting every primary alias?

Edit: I just realized that neither artist names nor aliases have a script property :astonished:

Technically possible, but it isn’t feasible with our current resources.

Why would we do that?
Only when the search term is in Latin script, you mean?

https://tickets.metabrainz.org/browse/MBS-5192

Now I am going to sound like a douche because I am about to complain about the very thing I was asking for…

Chris McDaniel gives me over 12,000 results.
https://musicbrainz.org/search?query=chris+mcdaniel&type=artist&method=indexed
While Christopher McDaniel gives me 1,600 results
https://musicbrainz.org/search?query=christopher+mcdaniel&type=artist&method=indexed
And “Chris McDaniel” gives me 1 result
https://musicbrainz.org/search?query=“chris+mcdaniel”&type=artist&method=indexed

I would like more than one (you may now get two, because I added an alias to another entry), but the 12,000 is too much.

Chris should cover Chris, Christopher, Christine, Christy, etc.
And McDaniel should cover MacDaniel, McDaniels… and maybe even go so far as McDonald.

But Leslie McDaniel (found on the first page of results) seems too much of a stretch.
And I have no clue why Fugitive (on page 497) is included. It looks like maybe because they have a disambiguation that includes the word Chris?

That’s the “happy medium” I was talking about. Somewhere between 1 exact result and 12,000 unrelated results.

It’s an OR search, i.e. it returns results that contain Chris or McDaniel.
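To make the difference concrete, here is a minimal sketch of OR versus AND matching on whitespace-separated terms. It is not how the search server is implemented; the helper names and the sample entries (including the disambiguation text for Fugitive) are invented for illustration.

```python
def tokens(text):
    """Lowercase a string and split it into a set of whitespace-separated terms."""
    return set(text.lower().split())


def matches_or(query, document):
    """True if the document shares at least one term with the query."""
    return bool(tokens(query) & tokens(document))


def matches_and(query, document):
    """True only if every query term appears in the document."""
    return tokens(query) <= tokens(document)


# Invented sample entries: artist name, optionally followed by disambiguation text.
artists = [
    "Chris McDaniel",
    "Leslie McDaniel",
    "Fugitive (band with Chris on drums)",  # hypothetical disambiguation
    "Christopher Max",
]

query = "chris mcdaniel"
or_hits = [a for a in artists if matches_or(query, a)]
and_hits = [a for a in artists if matches_and(query, a)]

print("OR: ", or_hits)
print("AND:", and_hits)
```

With OR, a single shared term is enough, which is why entries that only mention Chris somewhere, or only share the surname, can end up in the result set. Ranking, not the match count, decides what appears on the first page.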

Why does it matter if there are 12,000 results as long as the results on the first page are the best matches?

True. Absolutely.
No argument with that. But let’s take it a step further…

.

Christopher Max, whose real name is Christopher McDaniels (which was already an alias), did not show up in a search for “Chris McDaniel” until I just now added Chris McDaniel as an alias.

I admit, he probably showed up in the 12,000 results (I did not look). But, in my head, when I see 12,000 very loosely related results, I am not going to wonder why Christopher Max shows up in a Chris McDaniel search. He would end up being ignored like the other 11,999.

But on the opposite end, when I get the 2 results (such as when using quotes), I am going to wonder why Christopher Max showed up, and I am going to open it to see if this is the guy I am looking for.

.

To me, Christopher McDaniels is close enough to Chris McDaniel that it should be included in a search (without that specific alias being added), but without the 12,000 other results watering it down.

.

Please note: Yes, I am aware that using quotation marks in a search yields different results. So, for quotation marks, I do expect fewer results.

But I would like to see the non-quotation mark searches narrowed from the very broad 12,000, while not being so narrow as to only include the 2 results.
The first 5 pages do not include anyone named McDaniels or named Christopher unless it is added as an alias. I think Christopher McDaniels is definitely close enough to Chris McDaniel to be included in results.

https://beta.musicbrainz.org/search?query=chris*+AND+mcdaniel*&type=artist&limit=25&method=advanced

Tadaa! The search you are asking for is “something that starts with ‘Chris’ AND something that starts with ‘McDaniel’”. This is how you tell the search that explicitly.

Another search to illustrate what I mean more clearly https://beta.musicbrainz.org/search?query=chris*+AND+mc*&type=artist&limit=25&method=advanced
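For anyone building such links from a script, here is a small sketch that assembles the same kind of advanced-search URL. The function name `advanced_search_url` is hypothetical; the query parameters (`query`, `type`, `limit`, `method`) are simply the ones visible in the links above.

```python
from urllib.parse import urlencode


def advanced_search_url(prefixes, entity_type="artist", limit=25,
                        base="https://beta.musicbrainz.org/search"):
    """Build a URL that requires every given prefix to match (prefix* AND prefix*)."""
    # Append the wildcard to each prefix and require all of them with AND.
    query = " AND ".join(f"{prefix}*" for prefix in prefixes)
    params = {
        "query": query,
        "type": entity_type,
        "limit": limit,
        "method": "advanced",
    }
    # safe="*" keeps the wildcard readable instead of encoding it as %2A.
    return f"{base}?{urlencode(params, safe='*')}"


url = advanced_search_url(["chris", "mcdaniel"])
print(url)
```

Passing `["chris", "mc"]` instead produces the second, broader query.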

I recently added primary_alias as a specific boosted field and re-adjusted the weights.

The new results are as follows

Bach

Justin

Fred

Riot

Tchaikovsky

Pink
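As a rough illustration of what a separately boosted `primary_alias` field means for ranking, here is a toy scorer. The weights and sample records are invented for this sketch (the real values are not given in this thread), and the actual search server scores documents in a far more sophisticated way.

```python
# Hypothetical per-field boosts; only the relative order matters for this sketch.
FIELD_BOOSTS = {"name": 3.0, "primary_alias": 2.0, "alias": 1.0}


def field_matches(query, values):
    """True if every query term occurs as a whole word in at least one value."""
    terms = set(query.lower().split())
    return any(terms <= set(value.lower().split()) for value in values)


def score(query, entity):
    """Sum the boosts of all fields in which the query matches."""
    return sum(
        boost
        for field, boost in FIELD_BOOSTS.items()
        if field_matches(query, entity.get(field, []))
    )


def rank(query, entities):
    """Return labels of matching entities, best score first."""
    scored = [(score(query, entity), entity["label"]) for entity in entities]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [label for points, label in scored if points > 0]


# Invented sample records, one per kind of match.
entities = [
    {"label": "plain alias match", "name": ["Some Composer"], "alias": ["Tchaikovsky"]},
    {"label": "no match", "name": ["Bach"]},
    {"label": "primary alias match", "name": ["Пётр Ильич Чайковский"],
     "primary_alias": ["Pyotr Ilyich Tchaikovsky"]},
    {"label": "name match", "name": ["Tchaikovsky Quartet"]},
]

ranking = rank("tchaikovsky", entities)
print(ranking)
```

The point is only that an entity matched through its primary alias now ranks above one matched through an ordinary alias, while a direct name match can still come first.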

@samj1912: thanks for your hard work on the search; the results are now much more accurate for classical.

One problem I just noticed though: I’m searching for the related work for a recording titled:

Widmung ("Dedication")

and the search returns “()” and Japanese titles first. No idea how the old search behaved in that case.

You can see how the old search would have responded on the test server. The first pages are more or less the same as on the new search without quotes. Results like this are to be expected since Widmung (Liebeslied), S. 566 has no aliases currently. (I guess that’s the work you wanted to search for.)

This still isn’t the expected behavior however. Using a search like ("Widmung Dedication") explains what’s going on really well, I think:
[Screenshot: new_search_quote_test]

I just searched for Brooklyn In The Summer Aloe Blacc in recordings, and it doesn’t return Brooklyn In The Summer by Aloe Blacc on the first page of results, even though the recording is on MB. I almost added it, until I checked under Aloe Blacc and, sure enough, the single with that recording is there.

https://musicbrainz.org/search?query=Brooklyn+aloe+blecc&type=recording&limit=25&method=indexed

Just an FYI: search results are throttled, with a maximum response time of 3 seconds. Queries that take longer return partial results. This is part of the reason why queries that match a large number of results, especially for recordings, might return incomplete results.

I’ve reported these bugs already but haven’t got any response and the bugs still exist.

  1. Compare results of 2 searches:

https://test.musicbrainz.org/search?limit=100&method=advanced&page=1&query=date%3A2018-06-30+AND+status%3Aofficial&type=release

The old search returns 22 results.

https://musicbrainz.org/search?limit=100&method=advanced&page=1&query=date%3A2018-06-30+AND+status%3Aofficial&type=release

The new search returns 668 results. The reason is that the results of the new search include all releases where the exact release date is unknown and specified only as “2018”. Why are those releases included when searching by a specific release date? This doesn’t make any sense to me.

  2. Click on page 2 in the new search results. Notice the artist displayed at the top. Go to page 1 and back to page 2 (you may need to do this more than once). Notice that the artist displayed at the top of page 2 is now different. This behavior causes the following problem: when a user (or an application) tries to retrieve all records returned for given search criteria page by page, some records will be returned twice and some won’t be returned at all.

@samj1912 Can you look into this? My application depends on this type of search and fixing these 2 bugs is really important to me.
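Until both issues are fixed, a client could partly defend itself along these lines. This is only a sketch of a possible workaround, not an official recommendation: `fetch_page` is a hypothetical callable standing in for whatever the application uses to request one page of results, and the `id` and `date` keys are assumed record fields.

```python
def collect_unique(fetch_page, max_pages):
    """Walk result pages in order, keeping the first copy of each record ID."""
    seen = {}
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:
            break  # an empty page is treated as the end of the results
        for record in records:
            seen.setdefault(record["id"], record)
    return list(seen.values())


def exact_date_only(records, date):
    """Drop records whose date is only partially known (e.g. "2018")."""
    return [record for record in records if record.get("date") == date]


# Fake pages that imitate the reported behavior: r2 is served on both pages.
fake_pages = {
    1: [{"id": "r1", "date": "2018-06-30"}, {"id": "r2", "date": "2018"}],
    2: [{"id": "r2", "date": "2018"}, {"id": "r3", "date": "2018-06-30"}],
}

unique = collect_unique(lambda page: fake_pages.get(page, []), max_pages=10)
exact = exact_date_only(unique, "2018-06-30")

print([record["id"] for record in unique])
print([record["id"] for record in exact])
```

Deduplicating by ID only removes the repeats. It cannot recover records that were skipped because the ordering shifted between requests, so the unstable ordering still needs a server-side fix.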

Sorry for joining the topic a year later, but I believe this is the right place for my question.

I want to add the actor John Moffatt as AC, but if I type only until the last “T”, I get the results from the left until I complete his name:

Is it expected behavior to show all artists with the family name Moffat before the much better match with a slightly different family name Moffatt? Should I add the (wrong) variants as search improvement aliases or is this a general issue that should be addressed?

Why don’t we have both?

I would both add an alias and also add a ticket on our ticket tracker. :slight_smile:
