Picard 2.x and genres. Good practices? Advice on plugins?


#1

I am a bit confused about the current state of Picard 2.1 in regard to genres.

There is now a checkbox ‘Use genres from MusicBrainz’.
But using that doesn’t seem to produce any genre tags.
Is this perhaps a feature that hasn’t been fully implemented yet?
Or are some plugins, scripts or other settings needed to get it to work?

And on a broader scale:
There is the ‘folksonomy tags’ option, there is the AcousticBrainz Mood-Genre plugin, the LastFM plugin, the Tango.info Adapter plugin, the ‘wikidata-genre’ plugin, and maybe a few more?

Could somebody knowledgeable and experienced on the matter perhaps give some advice on good practices, differences, and perhaps on the status/usability of these plugins?


edit:
Some of my own testing results are in, and I thought it may be useful to have them in the start post:

So I used some random 80 tracks as a test to see what the different options for writing genres would bring.
It’s a completely non-scientific test. It’s just a personal effort to get some idea on how to proceed and what to use.

I tried:

  • Picard’s built-in MusicBrainz genre option
  • AcousticBrainz Mood-Genre plugin
  • Last.fm plugin
  • Tango.info Adapter plugin
  • wikidata-genre plugin

I looked at the number of tracks that received:

  • any tag at all.
  • tags that you would actually consider to be a genre.
  • tags that I would call useful keywords, such as avant-garde, 60’s, vocal, soundtrack, etc.
  • tags that are useless because they are completely subjective, commercial spam, four-letter cursing words, ranting, a repeat of just the artist name, etc. etc.
    For the purpose of this report, let’s call these ‘crap’ (complete rubbish added pollution)

.
Picard’s own ‘Use genres from MusicBrainz’ (Picard 2.1.1)

Only some 10% of the tracks received a tag.
They were very basic (jazz, rock, pop etc.) but roughly correct.
There were no keywords, and there was no crap.

  • Additionally checking ‘Use folksonomy tags as genre’:

Returns the same results, only with a couple of keywords added here and there.

note:
As @outsidecontext explained, there is currently an issue on the server side of MusicBrainz that is responsible for these low scores.
I’ll try to update this post when there is development.

.
AcousticBrainz Mood-Genre plugin (1.1.1)

Very weird results. The majority of the 80 tracks I used were labeled such as:
“electronic; house; rhy; jaz”
“electronic; ambient; hip; jaz”
“electronic; trance; cla; blu”
etc.
I don’t know what to make of that, but I am pretty sure that Bob Marley and Elis Regina never made music in the “electronic” genre.

So it seems to be useless at the current state.

.
Last.fm plugin (0.8)

  • ‘Use track tags’ and ‘Use artist tags’ both unchecked:
    No results at all. So it’s probably not using something like ‘release genre’.

  • ‘Use track tags’ checked:
    80% received a tag
    of which 20% was genre, 35% keyword, 45% crap.

  • ‘Use artist tags’ checked:
    95% received a tag
    of which 30% was genre, 70% keyword, 2% crap.
    (isn’t it fun when you can use inconsistent percentages in your own research?)

Check both boxes, and what you imagine that happens, happens.

Setting the slider in the options panel to a higher percentage does help in restricting the amount of crap.
But it is also likely to filter out some useful tags.

.
Tango.info Adapter plugin (1.1)

No results. Zero, nada, nakkes.

note:
I read somewhere on MusicBrainz or Github that there currently is a known issue with this plugin.
I’ll try to update this post when there is development.

.
wikidata-genre plugin (1.2)

90% received a tag.
Of which 99% were actual genres.
1% keywords, no crap.

.
If anybody has considerably different results, please let me know. There’s always a chance I messed something up while testing.


#2

I can answer that bit. From what I understand that is a “Build it and they will come” feature. The genres have only just been added to MB itself, which means that not many have been filled in yet. So I guess this will be of more use in years to come…

These genres are not that obvious on the Release Group pages. Look under TAGS and you’ll find them. https://musicbrainz.org/release-group/f5093c06-23e3-404f-aeaa-40f72885ee3a/tags

All other Genre values rely on a plugin pulling the genre data from other websites.

I don’t use any of the genre plugins since, like you, I don’t know what does or doesn’t work, so I let my media centre handle that side instead. Which means I am also watching keenly for the responses here.


#3

So the ‘Use genres from MusicBrainz’ option in Picard itself is sourcing from a database that was implemented only recently with a completely clean slate?
No disrespect to its intention and its potential for the future, but for the purpose of getting genre tags populated it doesn’t seem a viable option at the moment then.

On to investigating the other available options…


#4

It’s a new feature, but not an empty data set. MusicBrainz’ genre support is basically a server side whitelist over the folksonomy tags feature that has been around for years, so there is plenty of data already.

The reason this is not working reliably is that it currently fetches genres / folksonomy tags only on the release level, due to a server side bug. See “Folksonomy tags no longer work as of v2.0.0” for details.

The difference is that they use different data sources. The built-in feature in Picard uses MusicBrainz, the plugins external data sources. There is no general rule what to use, but I can give you my opinion:

  1. Picard built-in Genre / Folksonomy Tags support: This would be my first choice, especially with the recent implementation of genre tags on MB server. So I hope the aforementioned bug gets fixed soon since it makes this feature less useful right now.

  2. Last.fm genre: This is very similar to the built-in support, but uses folksonomy tags from Last.fm instead. Last.fm has been around for a long time and its tags are very comprehensive. So you usually get comprehensive results. But you need to keep an eye on the results since there are also non-genre tags you might want to blacklist.

  3. Wikidata genre: Regarding the data quality I think this is a great source. It’s more structured than the open folksonomy tag approach of Last.fm, so you usually get only actually relevant results. But the downside is that querying this data involves a lot of additional API calls, so enabling this plugin slows down Picard’s loading of albums noticeably.

  4. AcousticBrainz: As the name suggests this uses AcousticBrainz as a source, which gets its genre data from analysis of the audio. An interesting approach, but IMHO the results are not too useful in practice.


#5

Thanks a lot @outsidecontext, those are very useful insights.
For what it’s worth, I started ‘a project’ running some 80 random tracks using all the different options and plugins to see what they bring.
Hopefully I’ll be able to report something on that in a while that makes some sense and that’s useful to others.

My first report and advice: don’t use Last.fm with ‘use track tags’ enabled. You don’t want the garbage that that will result in.


#6

I’m not sure I could come up with a blacklist that would work.
There’s just too much crap and a lot of keywords that I would have no use for.

I’ll probably stick to my current workflow for a while longer, which is:
I let no tagging software touch my ‘genre’ tag. Ever.
I do retrieve all sorts of tags from different sources, but they get written to a placeholder tag. (‘genre sourced’)
Once in a while I make it a project to fill in my ‘genre’ tags manually, and for that purpose my ‘genre sourced’ tags are a very helpful source to steer me to an adequate genre to actually use.


#7

Picard has an option to only use your own tags for saving the genre, so you could use that and “manually fill in ‘genres’” on MusicBrainz, thus also helping MusicBrainz out! (And saving it for the future in case you need to retag something “from scratch” at some point.)


#8

That’s certainly a good and interesting idea/suggestion.

Yet there is one problem I foresee in that.
I use the wikidata-genre plugin to retrieve their genres, which are quite good, and by means of a script I write them to a placeholder tag called “genre sourced”.
I would like to continue doing that, so your suggestion for me would have to be additional, not instead of.
At the moment I can’t imagine a solution to use both wikidata-genre, and simultaneously use the ‘Only use my genres’ option to retrieve genre tags that I have entered at MB myself.
Maybe it could be done by altering the wikidata-plugin somehow, but that’s way beyond my capabilities.


#9

A month ago I experimented with the different plugins and genre settings in Picard 2.1 to see if any actually worked. I found that the closest you can get to good data is to use the Last.fm plugin, but you need to change the plugin settings to set the genre by artist and set the percentage used to 100%. This worked fairly well, but it still produced occasional errors, like “oldies” instead of “Rock & Roll”. In the end I just stuck with setting the correct genre while ripping and marking the tag as do-not-change in Picard.


#10

Thanks, that’s a good find that I overlooked, probably because it doesn’t clearly indicate what it’s for.
Do you understand, and could you explain what it does exactly?


#11

Not sure, but it improved the data and almost eliminated the multiple genre tags. My guess is that it is more of a ranking than a percentage. The more upvotes or identical inputs a tag has, the further it moves up the scale.


#12

It limits tags to the most used ones, compared to the most used tag. For example you have the following tags assigned:

  • rock: 30 times
  • folk rock: 25 times
  • alternative: 4 times

The most used tag is rock, which was assigned 30 times. This is our maximum. The other tags get compared to this. So we get the following percentage values for the tags:

  • rock: 100%
  • folk rock: 83% (25 / 30)
  • alternative: 13% (4 / 30)

If you set the percentage to 70% Picard will use both rock and folk rock. Setting it to 100% usually means you will only get the most used tags (it could still be multiple, though).
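
To make the arithmetic above concrete, here is a minimal Python sketch of that kind of threshold. It is not the code Picard or the Last.fm plugin actually runs; the function name `filter_by_relative_usage` and its parameters are made up for illustration, and it only assumes the rule as described: each tag's count is compared to the count of the most used tag.

```python
def filter_by_relative_usage(tag_counts, min_usage_percent):
    """Keep tags whose count is at least min_usage_percent of the top count.

    Hypothetical helper illustrating the rule described above; it is not
    taken from Picard or any plugin.
    """
    if not tag_counts:
        return []
    max_count = max(tag_counts.values())
    if max_count <= 0:
        return []
    kept = []
    for tag, count in tag_counts.items():
        relative = 100 * count / max_count  # percentage of the top tag
        if relative >= min_usage_percent:
            kept.append(tag)
    return kept


counts = {"rock": 30, "folk rock": 25, "alternative": 4}
print(filter_by_relative_usage(counts, 70))   # ['rock', 'folk rock']
print(filter_by_relative_usage(counts, 100))  # ['rock']
print(filter_by_relative_usage(counts, 10))   # ['rock', 'folk rock', 'alternative']
```

With two tags tied for first place, a setting of 100 would keep both, which is why 100% can still yield more than one genre.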

I just checked the default values: The last.fm plugin sets this to 15%, which gives a preference of using more tags even if they seem to be off. Picard’s own tag support has a similar option which defaults to 90%. This prevents outliers from being included in tags. I think a higher value is actually a better default, unless your goal is to have as many genres listed as possible :slight_smile:


#13

Thanks for explaining @outsidecontext

I’ll experiment with that slider. Maybe it will solve getting tags for a certain female singer populated like this when using the Last.fm plugin:

“People I Want To Have Sex With; Singer-Songwriter; Feminine Cavern Of Love; Vaginal; Female Vocalists; Chanson; Groove; Nouvelle Vague; Indie; Mount Me And Ride Me Like A Pony; French; All; I Would Like To Spend An Afternoon Rubbing Her Breasts With Warm Mineral Oil; Female Vocalist”

I am certainly not an advocate for censorship. So if people spill their guts with garbage (and Last.fm has no adequate filtering system in place), it’s on them and I can handle it.
But it might be a good idea to explain this function a bit in Picard’s settings panel for the Last.fm plugin?


#14

I’ve tried the Wikidata genre plugin, but it uses “/” as a separator instead of " / " (space, slash, space), which is what I specified in the Metadata > Genres > Join multiple genres with setting.

How can I fix this, or get genres separated by " / "?

The issue is that this format is incompatible with my media server (Kodi).


#15

The current version of the plugin does not join genres at all, they are handled as a multi-value tag. If you have Picard configured for use of ID3 v2.3 the tags will be joined with the delimiter configured at Options > Tags > “Join multiple ID3v2.3 tags with”. This setting defaults to /, but you can change it.
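
As a rough illustration of what “joined with the delimiter” means here, the following Python sketch collapses a multi-value tag into a single string. It is only a toy model of the behaviour described above, not Picard's own code; the function name `render_for_id3v23` is hypothetical.

```python
def render_for_id3v23(values, delimiter="/"):
    """Collapse a multi-value tag into one string using the given delimiter.

    Toy model only: the default "/" mirrors the default mentioned above,
    and the function name is made up for this example.
    """
    return delimiter.join(values)


genres = ["rock", "folk rock"]
print(render_for_id3v23(genres))         # rock/folk rock
print(render_for_id3v23(genres, " / "))  # rock / folk rock
```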

If you are interested @GrokMan is working on an improved Wikidata plugin that offers some configuration option, including setting the delimiter.


#16

Thanks. I do have Picard configured for use of ID3 v2.3. Does this only apply to tag info from the MusicBrainz database?

I am interested in an improved Wikidata plugin. Is there some way I can help?


#17

This example is horrible, it’s actually a good argument for a whitelist approach. The discussion about black vs white listing of genres came up years ago, but nobody did the required work. But now that we have that server side whitelist I’m thinking about using it for other genre plugins as well, especially Last.fm. At least it could be a reasonable default.

But increasing that “minimal tag usage” percentage to 90% or so should also help much, and we should really change the default here.
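
For anyone wondering what a whitelist filter would look like in practice, here is a small Python sketch. The whitelist below is a made-up three-entry sample, not the actual MusicBrainz genre list, and `filter_by_whitelist` is a hypothetical helper rather than anything that exists in Picard or the Last.fm plugin.

```python
def filter_by_whitelist(tags, whitelist):
    """Keep only tags that appear in the whitelist, ignoring letter case.

    Hypothetical helper; the original spelling and order of the kept
    tags are preserved.
    """
    allowed = {genre.casefold() for genre in whitelist}
    return [tag for tag in tags if tag.casefold() in allowed]


# Made-up sample whitelist, NOT the real server side list.
sample_whitelist = ["chanson", "indie", "singer-songwriter"]

tags = ["Singer-Songwriter", "Female Vocalists", "Chanson", "Indie", "French", "All"]
print(filter_by_whitelist(tags, sample_whitelist))
# ['Singer-Songwriter', 'Chanson', 'Indie']
```

Unlike a blacklist, nothing has to be known in advance about the unwanted tags: anything not on the list is dropped.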


#18

Yes, you can test it and give feedback. I packaged the updated plugin up and uploaded it to https://transfer.sh/AfxJc/wikidata.zip

Just place this ZIP file in Picard’s plugin folder. You can do this from inside Picard using Options > Plugins > Install plugin… . Make sure that the plugin is enabled. After first install / activation you will have to close the options and open them again, then you will have additional settings under Options > Plugins > wikidata-genre


#19

This garbage example was a bit extreme and rare, and only present when using Last.FM’s artist tag for this specific performer.
Using track tags didn’t show it.

After doing some tests with different percentages it looks like the default 15% setting isn’t that bad.
You can surely filter out some garbage by setting it higher, but you will also quickly lose a lot of possibly useful tags.
And even with a setting of 80% it shows that somebody labeled Aphex Twin as “F…g Wh…e”.

==

The option to use whitelisting for genres seems a good approach to me.

This is probably too far fetched and complicated to make it a reality, but since we are sharing ideas here:

One problem that I see is that all sorts of words and descriptions are usually thrown in one big pile, and that pile is then called ‘genres’.
But of course descriptions such as avant-garde, 60’s, Top40, Soundtrack etc. are not genres but keywords or descriptions.

Ideally I would use:

  • A ‘genre’ tag that contains actual genres. (strict)
  • An additional tag that contains subgenres. (strict)
  • A tag for style/form, such as A cappella, Duet, Industrial, etc. (whitelisted)
  • A tag for keywords that people might consider useful, such as Top40, Stadium Rock, Soundtrack, Piano concerto, etc. (loose, and possibly blacklisted to prevent profanity)

I am using something like this myself already, using a list I created a while back. (it does probably need some updating)
You can find it here, perhaps it’s useful as some starting point or to fuel the discussion:

edit:
I see that sites such as AllMusic and Discogs are doing something like this already:


#20

I’ve been thinking of creating a genre filter plugin that adds a set of script functions to filter the genre tags.

Wikidata has a good list of genres, and each genre can have a list of parent genres.
See http://tinyurl.com/y7vkp8yr as an example query.

The idea would be to run this query once and save this for the next run.

The easiest script would be to filter against the list.

Another thing I was thinking of is to try to simplify the list of genres and replace a sub-sub-sub genre with a parent genre. As we have parent genres, you could find an item in the list, find the shortest path to a list of basic genres such as rock, and use the parents instead.
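
The shortest-path idea could be sketched roughly like this in Python, using a breadth-first search up the parent links. Everything here is an assumption for illustration: the `parents` mapping is hand-written toy data rather than the result of the Wikidata query, the set of basic genres is arbitrary, and `nearest_basic_genre` is a hypothetical name.

```python
from collections import deque


def nearest_basic_genre(genre, parents, basic_genres):
    """Breadth-first search up the parent links.

    Returns the basic genre reachable in the fewest steps, or None if
    no basic genre can be reached. Hypothetical helper for illustration.
    """
    queue = deque([genre])
    seen = {genre}
    while queue:
        current = queue.popleft()
        if current in basic_genres:
            return current
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None


# Toy data, NOT real Wikidata output: genre -> list of parent genres.
parents = {
    "blackgaze": ["black metal", "shoegaze"],
    "black metal": ["extreme metal"],
    "extreme metal": ["heavy metal"],
    "heavy metal": ["rock"],
    "shoegaze": ["alternative rock"],
    "alternative rock": ["rock"],
}
basic_genres = {"rock", "jazz", "electronic"}

print(nearest_basic_genre("blackgaze", parents, basic_genres))  # rock
print(nearest_basic_genre("polka", parents, basic_genres))      # None
```

If two basic genres are equally close, this sketch simply returns whichever one the search reaches first, so a real plugin would need to decide how to break such ties.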