Search results improvements

With the recent launch of Solr to beta (read all about the improvements here - MB Search Overhaul), I have received a lot of tickets on the Solr ticket tracker asking why a particular search doesn’t return the expected results.

Since I am expecting a lot of these, please remember that we cannot possibly cover every case. What we can do is tune our search config to perform as well as possible. To that end, I am creating this thread as a one-stop discussion/request thread for improving the boosts.

To help, please post your search improvement requests in the following format:

  1. Link to the beta site search with the query (To see Solr’s results)

  2. Link to the main site search with the query (To see our old search server’s results)

  3. Expected result.

  4. An explanation of why you think your expected result should have a better score (answering questions like: is your query in the entity’s alias? Its comment? Its sort-name?)

The above format, although not strictly necessary, will be really useful for me, as I can debug things a lot faster and improve our search results.

@reosarevok and @Freso will be helping me sort out any editing-specific doubts and bridge the gap between the search server code and how you as an editor expect it to work.

Pinning this thread for visibility.

The issues from SOLR-86:

The search results for works are less relevant on the beta server with the new SOLR search.
When searching for “aria Final Fantasy VI” I want this work, but instead I get a list of works that are far from what I am looking for.
Compare the results:
Search results - MusicBrainz
Search results - MusicBrainz
The way this is now on the beta I’m less likely to find the work I’m looking for. In this case I am counting on the search to look at the disambiguation as well.

There is no “aria final fantasy vi” in the aliases.
There is an “aria” alias, but “Final Fantasy VI” is a parent work.
Would you like parent work titles and aliases to be taken into account in work search too?

Personally, I like that the results are now more specific to the search; that is why I used to use the direct search often.

I think searching parent works would be a boon, it would solve the problem nicely.

Does the new search not do partial matching for aliases? “rahva” used to find https://beta.musicbrainz.org/artist/9be7f096-97ec-4615-8957-8d40b5dcbc41/aliases but it no longer does.

I just searched the artist index for fred and was a bit surprised that it returned Fred Astaire, Fred Frith, Fred de Fred and Fred Steiner before any artists just called Fred.

https://musicbrainz.org/search?query=fred&type=artist&method=indexed

I then tried an advanced search just on artist:fred, and the first plain Fred didn’t appear until the bottom of the page.

https://musicbrainz.org/search?query=artist%3Afred&type=artist&limit=25&method=advanced

I know it was decided to boost popular artists over less popular ones (which doesn’t seem very MusicBrainz, which is why I never did it), but shouldn’t exact name matches come before partial matches? (A search for Fred is a better match for Fred than for Fred Astaire.)
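To make the request concrete, here is a minimal Python sketch of the ordering I have in mind. It is not how Solr or the MusicBrainz search config works; the `(name, score)` pairs and the scores themselves are invented for illustration, and `rank_exact_first` is a hypothetical helper.

```python
def rank_exact_first(query, hits):
    """Order hits so exact name matches precede partial ones.

    `hits` is a list of (name, score) pairs; both the shape and the
    scores are made up for illustration.
    """
    q = query.casefold()

    def key(hit):
        name, score = hit
        is_exact = name.casefold() == q
        # False sorts before True, so "not is_exact" puts exact matches
        # first; within each group, higher scores come first.
        return (not is_exact, -score)

    return sorted(hits, key=key)


hits = [
    ("Fred Astaire", 9.1),
    ("Fred Frith", 7.4),
    ("Fred", 3.2),
    ("Fred Steiner", 5.0),
]
ranked = rank_exact_first("fred", hits)
print([name for name, _ in ranked])
# ['Fred', 'Fred Astaire', 'Fred Frith', 'Fred Steiner']
```

The popularity boost still decides the order among the partial matches; the exact match just gets to go first.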

I’d expect Search Results - MusicBrainz to find Maria Drozdova - MusicBrainz (same surname, only one letter difference - turns out it’s the same artist too).

It might work now that I’ve used her on an AC, not sure - it definitely didn’t work earlier though, 0 results.

Searching for “XXme” doesn’t find an artist entered as “XX:me”. Not really sure if that’s a major bug or something easy to fix, but I imagine that kind of behaviour will fool people into believing that the thing they’re searching for doesn’t exist in the database.
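A rough sketch of what I would expect to happen, in Python. This only illustrates the idea of comparing names after stripping punctuation; I don't know how the actual search analyzers are configured, and `normalize_name` is a hypothetical helper, not something from the search server.

```python
import unicodedata


def normalize_name(text):
    """Casefold the text and keep only letters and digits."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return "".join(ch for ch in folded if ch.isalnum())


print(normalize_name("XX:me"))  # xxme
print(normalize_name("XXme"))   # xxme
print(normalize_name("XX:me") == normalize_name("XXme"))  # True
```

Matching on a normalized form like this, in addition to the name as entered, would let “XXme” find “XX:me”.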

Also, this should probably be updated at some point.

Hi,

Not sure if this is Solr’s fault, but “Search for releases where script is not set” does not seem to work anymore. The same query on mb-beta behaves the same way.
These queries are used on the statistics page.

Searchable via https://musicbrainz.org/search?query=-script%3A*&type=release&limit=25&method=advanced

We will be switching to a consistent syntax as above to find all unset fields.
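For anyone building these links in a script, here is a small Python sketch that reproduces the URL above from the query string. The helper name `advanced_search_url` is my own; only the URL parameters are taken from the link itself.

```python
from urllib.parse import urlencode


def advanced_search_url(query, entity_type, limit=25):
    """Build an advanced-search URL for the main MusicBrainz site."""
    params = {
        "query": query,
        "type": entity_type,
        "limit": limit,
        "method": "advanced",
    }
    # safe="*" keeps the wildcard unescaped, as in the URL quoted above;
    # the colon is still percent-encoded as %3A.
    return "https://musicbrainz.org/search?" + urlencode(params, safe="*")


url = advanced_search_url("-script:*", "release")
print(url)
# https://musicbrainz.org/search?query=-script%3A*&type=release&limit=25&method=advanced
```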

Fixed the stats page here - https://github.com/metabrainz/musicbrainz-server/pull/690

While I felt the old search was a bit odd to use, giving me dozens (sometimes hundreds) of unrelated results, including results that I can’t even figure out how or why they could be included…
The new search could maybe be expanded a little. For example:
I searched confederate railroad, but I misspelled it. My search, confederate railrold, gave me zero results.

While zero results is technically correct, a loosening of the criteria could be useful. Especially when considering simple name variations (john vs jon or smith vs smithe).
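To show how small these differences are, here is a Python sketch using plain Levenshtein edit distance. It is only an illustration of "loosening the criteria"; I am not claiming this is how the search server would or should implement it, and `fuzzy_matches` with its `max_distance` parameter is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character insertions,
    deletions and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_matches(query, names, max_distance=1):
    """Return names within `max_distance` edits of the query, ignoring case."""
    q = query.casefold()
    return [n for n in names if edit_distance(q, n.casefold()) <= max_distance]


print(edit_distance("john", "jon"))      # 1
print(edit_distance("smith", "smithe"))  # 1
print(fuzzy_matches("confederate railrold",
                    ["Confederate Railroad", "Railroad Earth"]))
# ['Confederate Railroad']
```

All three examples are a single edit away from the intended name, so even a very tight tolerance would have caught them.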

I’ve noticed 2 bugs with the new search system.

  1. When searching by a specific release date, the search results include releases where only the year is specified in the release date field. Previously such releases were not included (and I believe they shouldn’t be). Example query: https://musicbrainz.org/search?limit=100&method=advanced&page=1&query=date%3A2018-06-30+AND+status%3Aofficial&type=release

  2. Something is wrong with paging. If you run the query above and switch between page 1 and page 2 several times, you will see that the content of each page is not stable. Or you can simply refresh the page several times: in 50% of cases it will display different records.
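I can only guess at the cause of the paging problem. One common reason for this kind of behaviour (an assumption on my part, not something I have confirmed for this search) is that many results share the same score and nothing fixes the order among them. A Python sketch of the usual remedy, with hypothetical `score` and `id` fields:

```python
def page_of(results, page_number, page_size=25):
    """Return one page of results in a deterministic order.

    `results` is a list of dicts with hypothetical "score" and "id" keys.
    """
    # Sort by score (highest first), then by id, so that ties always
    # come out in the same order regardless of the input order.
    ordered = sorted(results, key=lambda r: (-r["score"], r["id"]))
    start = (page_number - 1) * page_size
    return ordered[start:start + page_size]


# Four releases with identical scores, delivered in two different orders.
rows = [{"id": f"release-{i}", "score": 1.0} for i in range(4)]
first_visit = page_of(rows, 1, page_size=2)
second_visit = page_of(list(reversed(rows)), 1, page_size=2)
print(first_visit == second_visit)  # True
```

With a unique field as the tiebreaker, page 1 stays the same on every refresh even when all the scores are equal.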

Have you got some tests for this new search? If not, you really need some; otherwise you’ll find that as you try to improve things for one type of search, you will inadvertently make the results worse for another.
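Something as simple as a list of queries with the result each one must return would do. A minimal Python sketch, where `check_cases` is a hypothetical helper and `fake_search` is a canned stand-in for a call to the real search:

```python
def check_cases(search, cases, top_n=1):
    """Run each (query, expected_name) case and collect the failures.

    `search` is any callable that takes a query string and returns a
    ranked list of names.
    """
    failures = []
    for query, expected in cases:
        top = search(query)[:top_n]
        if expected not in top:
            failures.append((query, expected, top))
    return failures


def fake_search(query):
    # Canned results so the sketch runs without a search server.
    canned = {
        "fred": ["Fred Astaire", "Fred"],
        "confederate railroad": ["Confederate Railroad"],
    }
    return canned.get(query.casefold(), [])


cases = [
    ("fred", "Fred"),
    ("confederate railroad", "Confederate Railroad"),
]
failures = check_cases(fake_search, cases)
print(failures)
# [('fred', 'Fred', ['Fred Astaire'])]
```

Every request posted in this thread could become one more entry in `cases`, so a boost change that fixes one search and breaks another shows up immediately.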

If I search for Chris de Burgh I get 60+ hits. If I search for Chris deBurgh I get one hit (the correct one).

I am not sure what’s wrong with giving correct results? The de Burgh search finds more results simply because those results match the terms. The correct result is still number 1 either way.

The de Burgh search appears to return any name which includes a portion of my search string - unless I enclose it in double inverted commas. That’s probably what it is programmed to do. I have no problem with that.

What I found strange was that it gave me just the correct name when I searched for what was in effect an incorrect name.

My local tax bill hasn’t been paid for years. They keep sending the tax bill to the wrong person. A simple misspelling of my name voids my obligation to pay them. Correct address. Phonetically acceptable version of my name. But it is spelled wrong.

Correct results are awesome.
But a simple spelling discrepancy between de burgh and deburgh shouldn’t be excluding other spelling discrepancies.

Think about it: if I search for the spelling used in a news article and get no results, my first instinct is not to try other spellings, since I am reading a news article that has that spelling. I am going to add a new artist entry.

But, if I got a list of results that are not exact, but are at least close, my first instinct is to open them to see if there could be a misspelling - especially if some of the other information is filled in (disambiguation, dates, places, etc).

In my ideal world, at Big Rock Candy Mountain, I can set a sliding level switch anywhere from “exact search term” to “very very fuzzy search term” and have the results morph in front of my eyes.