Trying to standardize genres/other tags

Context: Cause I hate working with strings when you can use an array

Problem:
I’d like to write a script that makes tags look the same and standardizes how they are spelled, because when searching for “Steampunk”, most (?) apps/programs won’t return “Steam Punk”. The same goes for “Dnb”, which you can write as “Drum’n’Bass”, “Drum & Bass”, “Drum and Bass”… Yikes.

My problem is that I need to go through the genre variable, check if one of the following is present, delete it and add the new genre. There is no simple way to verify if a value is present and delete it.

Ebm             => EBM
Edm             => EDM
Dnb             => Drum and Bass
Darkwave        => Dark Wave
Steam Punk      => Steampunk
Shadowrap       => Shadow Rap
Pop/Rock        => Pop Rock
Alt. Rock       => Alternative Rock (Saw a lot of those on MB)

(Still trying to figure out when or when not to put a space between the two, even the official MB genre list is confused with this :stuck_out_tongue: )
The multi-value variables are missing some useful functions.

For example, with my tired brain, I thought about the following:
$if($delmulti(genre,Dnb),$addmulti(genre,Drum and Bass))
Which may or may not be feasible/optimal and whatnot, but we can discuss that too :smiley:

delmulti could be used alone, but it could also return true/false depending on whether the value was actually deleted, so we could easily use that function as a condition too.

I’m thinking that this could also use a multi to look for:
$if($delmulti(genre,Dnb; Drum'n'bass),$addmulti(genre,Drum and Bass))
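To make the proposed behaviour concrete, here is a rough sketch in Python (the language Picard plugins are written in) of what "delete, report whether anything was deleted, then add" could look like. The names `del_multi` and `add_multi` are made up for illustration and are not existing Picard functions.

```python
def del_multi(values, unwanted):
    """Remove every entry found in `unwanted` from `values` (in place).

    Returns True if at least one entry was removed, so the call can be
    used directly as a condition.
    """
    before = len(values)
    values[:] = [v for v in values if v not in unwanted]
    return len(values) != before


def add_multi(values, new_value):
    """Append `new_value` unless it is already in the list."""
    if new_value not in values:
        values.append(new_value)


# Hypothetical genre list, standing in for the multi-value genre tag.
genre = ["Dnb", "Jungle"]
if del_multi(genre, {"Dnb", "Drum'n'bass"}):
    add_multi(genre, "Drum and Bass")
print(genre)
```

Note that this sketch compares values exactly, so "dnb" in lower case would not be caught.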

Now, maybe this is niche or already being worked on. I may have overlooked use cases or potential “why this isn’t a good idea”, hence this post.

I would appreciate your inputs on this matter, and if you have any tips, etc.

EDIT: Might also use a replacemulti function of some sort.

:beers: Cheers

This is an issue that takes a toll on my efforts to ‘get genres right’ too.
As far as I can think of on the spur of the moment, what would be needed is a comprehensive list that contains a ‘preferred’ spelling of acknowledged genres, with a second level that contains the alternate spellings that exist ‘in the wild’ for each of them.
I have such a list myself, but of course it is subjective.
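For illustration only, a two-level list like that could be sketched in Python as a mapping from each preferred spelling to its known variants, inverted into a case-insensitive lookup. The entries below are just the examples from the first post, and the names `PREFERRED`, `ALIASES` and `normalize` are hypothetical.

```python
# Hypothetical two-level table: preferred spelling -> spellings seen in the wild.
PREFERRED = {
    "Drum and Bass": ["Dnb", "Drum'n'Bass", "Drum & Bass"],
    "EBM": ["Ebm"],
    "Alternative Rock": ["Alt. Rock"],
}

# Invert the table into a case-insensitive lookup: any spelling -> preferred.
ALIASES = {}
for preferred, variants in PREFERRED.items():
    for spelling in [preferred, *variants]:
        ALIASES[spelling.casefold()] = preferred


def normalize(genres):
    """Map each genre to its preferred spelling and drop resulting duplicates.

    Genres that are not in the table are kept unchanged.
    """
    result = []
    for g in genres:
        fixed = ALIASES.get(g.strip().casefold(), g)
        if fixed not in result:
            result.append(fixed)
    return result


print(normalize(["dnb", "Drum & Bass", "Jazz"]))  # ['Drum and Bass', 'Jazz']
```

Keeping the table as data rather than as one script line per spelling means new variants can be added without touching the logic.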

Just curious, I am guessing you want the results to end up in a music player? What player is that?

edit:
B.t.w. I don’t think Steampunk is a genre. Maybe a style, or maybe a keyword/description.

In my opinion MusicBrainz at this moment in time is pretty rigid and un-creative where it concerns genres. (or something such as ‘styles’)
Other sites such as RYM and Discogs are much more active and open to discussion and progress, but to be honest, I don’t see anything close to a holy grail there either.

What about something like:

$setmulti(genre,$map(%genre%,$if($lower(%_loop_value%,ebm),EBM,%_loop_value%)))

This uses currently available scripting functions and has the benefit of performing a case-insensitive comparison.

This is an issue that takes a toll on my efforts to ‘get genres right’ too.

Very glad I’m not the only one losing sleep over this :joy:

What player is that?

I’m mainly using Jellyfin ATM, with Gelli on Android. I also have a Navidrome instance running because it’s very fast and lightweight. I got tired of desktop music players, if I’m being honest.

B.t.w. I don’t think Steampunk is a genre. Maybe a style, or maybe a keyword/description.

For the Steampunk “genre”, yeah, it’s mostly a style, and I must admit that although I love music, I’m not too “geek” about it (in the sense that I don’t know much). I love tagging like that as it makes it easier to find similar songs/albums in my library, mainly because when you self-host you don’t have the machine learning and human-curated playlists, built from what other people listened/are listening to, that Spotify has, for example. Having more “genres” also makes it easier to make smart playlists :smiley: (once Jellyfin implements those)

If you care to share such a list, although I agree with you that it is subjective, it would make my life so much easier than trying to compile one from scratch. I’m planning on copying the “approved” list of genres from MB and adding the ways each one might be spelled so they can be unified. (Making sure both upper case and spaces are OK)

I do visit last.fm to get some inspiration of “style/genre” on some albums I find lacking in tags (tags here refer to the list of genres), again to make it easier to find.

On the side, also regarding genre/style:
My main frustration comes from MB, where you can tag artists, release groups, albums and specific songs, but there is no way to easily edit those tags.

For example, I saw a 06 tag on multiple songs from 1 specific release.
I also had to copy multiple genres from an artist to like 20 release groups so Picard would use them. And then go through all the releases because someone tagged Punk on a hip hop artist that doesn’t do punk at all - although this might also be subjective but the consensus across multiple sources was hip hop/trap/emo rap.

I created a plugin that implements a $replacemulti script function:

After enabling the plugin you can add a script (Options > Scripting):

$replacemulti(genre,Ebm,EBM)

Not extensively tested but it works for me!

Thanks for the base, I had an error, this seems to work:

$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),ebm),EBM,%_loop_value%)))
$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),edm),EDM,%_loop_value%)))

$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),dnb),Drum and Bass,%_loop_value%)))
$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),drum'n'bass),Drum and Bass,%_loop_value%)))
$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),drum & bass),Drum and Bass,%_loop_value%)))

$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),darkwave),Dark Wave,%_loop_value%)))
$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),steampunk),Steam Punk,%_loop_value%)))
$setmulti(genre,$map(%genre%,$if($eq($lower(%_loop_value%),shadowrap),Shadow Rap,%_loop_value%)))

Although it looks like a mess, it seems to do the job.

I am confused.

In MusicBrainz, genres are a well-defined, finite subset of tags - the list can be found here. @hiccup steampunk definitely is a genre - it is on this definitive list - unless of course you are questioning the sanity of the MB genre wizards?

The tags that you enter are free-form, and so I assume that genre identification from tags is case insensitive. So I can see that all of (ebm, EBM, Ebm, eBm, ebM, EBm, EbM and eBM) would be considered the same genre. But steampunk is a genre, and (unless spaces are ignored), “Steam Punk” would not seem to be a genre.

Of course, you get a lot of people who enter freeform tags thinking that they are entering a defined genre, but they enter Steam Punk rather than steampunk and it ends up a tag instead. This is the problem with tags and the kludge MB implementation of genres as a sub-set of this - and the result is a data clean-up nightmare.

This then gets infinitely more complex when you consider sources for the genre tag other than MB’s genres (the lastFM and AcousticBrainz Mood-Genre plugins immediately come to mind).

So, if this is a discussion about a data issue, then it’s complex.

But if this is a pure scripting question, then much simpler. And (as usual) @rdswift or Bob is the font of all knowledge. :slight_smile:

So I can see that all of (ebm, EBM, Ebm, eBm, ebM, EBm, EbM and eBM) would be considered the same genre.

I guess it depends on how the player/server implemented their genre fetching script thing hahaha, but indeed it is the same. For sanity, though, it’s nice to see them all written the same way all the time.

And I do like custom genres (tags) too, as they can sometimes better define what something is, but writing the year of the release as a genre (on a public database, I mean) or writing the album name as a genre (yeah, saw that too) is pollution in my eyes - although this topic won’t fix that.

I might set up an extensive list of banned genres in Picard (like all the years and stuff) and use either @atj’s or @rdswift’s solution to make a huge script that just tries to fix the genres (to my subjective liking, while also trying to respect the official MB list).

EDIT:
I hate that sometimes it’s 2 words while sometimes it’s a single word: Shadowrap, Steampunk, but it’s Dark Wave and not darkwave :confused:

For something like this, I’d likely use a regular expression like ^drum.*bass$ (with appropriate escaping for Picard) in an $rsearch() function rather than a bunch of discrete tests.
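As a quick way to see what that pattern would and would not catch, here it is tried out in plain Python with the `re` module. This only shows the regular expression itself. The escaping needed inside a Picard script is not shown, and the case-insensitive flag is an assumption added for this example.

```python
import re

# The pattern from the post, made case-insensitive for this example.
DRUM_AND_BASS = re.compile(r"^drum.*bass$", re.IGNORECASE)

# Sample tags, made up for illustration.
tags = [
    "Drum'n'Bass",
    "Drum & Bass",
    "drum and bass",
    "Dnb",
    "Drums with Double Bass",
]

matches = [t for t in tags if DRUM_AND_BASS.search(t)]
print(matches)
# ["Drum'n'Bass", 'Drum & Bass', 'drum and bass', 'Drums with Double Bass']
```

Two things stand out: the abbreviation "Dnb" still needs its own test, and a pattern this loose also matches anything else that starts with "drum" and ends with "bass".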

I unpack the genre mess like this:

  1. Broad genres have general consensus (we can probably mostly agree about Classical or Jazz for example). Fine genres are considered much more subjective.

  2. Genres are IMO hierarchical not exclusive. Baroque or symphony or chamber-music would generally be considered sub-genres of classical for example.

  3. Whilst there is a place for free-form tags, it is absolutely inevitable that they get messy and ruin data quality. If you don’t want genres to be messy, you have to have a defined list and prevent typos - so selected from a dropdown, not entered freeform where speeling mistooks oftenn happon.

  4. If you want consistent data quality, then it needs to be curated - i.e. you need to treat major Genre like Album type - it is a required field, selected from a defined list, and subject to voting.

That is not to say that there isn’t a place for freeform tags (which you accept from the start will become messy - but when you give a scribble pad to a pre-school child, you know that will be the case, but you do it because it allows them to express themselves).

The Picard lastFmPlus plugin did a pretty good job of cleaning up genres from lastFm, so if you want to know how a plugin can do this, that is where I would start.

I have also suggested that Picard provides better support for:
a. Multiple genre source plugins - functionality to handle multiple sources adding to the genre list
b. Genre sanitising plugins - an event to call a plugin that does this lastFmPlus style clean-up.

The current genre list seems somewhat arbitrary. There is no edit history, so who knows how some of them got on the list. There are no descriptions for them - can someone please tell me exactly what “blackgaze” or “brostep” are, and what the differences are between “conscious hip hop” and “underground hip hop” or “contemporary classical” and “modern classical” or even (which I thought were the same) “punk” and “punk rock”? There is no hierarchy, so some “Romantic classical” items have “Classical” as a genre and some don’t, and “Classical” cannot be implied. As for spaces, don’t get me started on “*punk*” where sometimes they are one word, sometimes two with a space, sometimes two with a hyphen. And the current implementation does not provide support for aliases, so you cannot alias “Steampunk” to “Steam Punk” etc. Aaaaaaaaaaaaaarrrrrrrrrggggggggggggghhhhhhhhhhhhhh!!!

“brostep” :joy: funny you mention this as I saw this for the first time yesterday… it killed me.

I always imagined a tree with all genres and sub-genre (and sub sub genre) where you just select what applies and then picard could tag your music depending on the level of details you prefer.
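Just to sketch the idea (this is not something Picard does today as far as this thread goes), the tree could be stored as child-to-parent links, with a detail level deciding how far down the tree a tag is allowed to go. The hierarchy below and the names `PARENT`, `path_to_root` and `at_detail` are invented for illustration.

```python
# Hypothetical child -> parent links; top-level genres have no entry.
PARENT = {
    "Baroque": "Classical",
    "Chamber Music": "Classical",
    "Drum and Bass": "Electronic",
    "Liquid Funk": "Drum and Bass",
}


def path_to_root(genre):
    """Return the chain from the top-level genre down to `genre`."""
    path = [genre]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def at_detail(genre, max_depth):
    """Return the ancestor of `genre` at most `max_depth` levels deep.

    `max_depth` must be at least 1 (1 = top-level genres only).
    """
    return path_to_root(genre)[:max_depth][-1]


print(at_detail("Liquid Funk", 1))  # Electronic
print(at_detail("Liquid Funk", 2))  # Drum and Bass
print(at_detail("Liquid Funk", 3))  # Liquid Funk
```

A genre that sits higher in the tree than the requested level is simply returned as-is.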

I totally forgot about that one as I tried the last.fm plugin bundled up with Picard itself. I thought the plus one was abandoned. Or was it renamed? If not, do you mind sending the link my way as I can only seem to find open PRs about it but no download :confused: (I apologize for my incompetence)

https://rateyourmusic.com/genre/brostep

RateYourMusic (RYM, going to be renamed to Sonemic) is my place to go when I have doubts on ‘genres’ or want to learn more.
All their validated genres have a description, links to the hierarchy they belong to, all releases that are tagged with that genre, and they show known alternate spellings.

https://rateyourmusic.com/genre/Drum+and+Bass/

In my opinion it is something stylistic.
It is valid for movies, video-clips, images, designs of machines and devices, and for dressing-up in costumes.
I have no idea how steampunk music is supposed to sound.

I agree entirely. But then again I have no idea what most of these genres sound like and no definitions to help me. It appears to be an arbitrary list of styles, produced by a secret cabal of elite genre definers.

Perhaps MB should adopt the RateYourMusic genre hierarchy and descriptions or partner with them to improve how MB deals with Genres and to share data.

I think it never got converted to v2. It may be in the repo under the 1.0 branch.

The list is not that bad. As a start.
It only has very few entries that shouldn’t be there, but most are perfectly o.k.
It is very incomplete though.

The problem as I see it is that genre discussions seem to be happening in chats (which is a very bad platform for things like this), and/or at the tickets platform, which most users will be oblivious to.

A forum board would be much better suited for discussing and suggesting genres.
But, it would probably also need severe moderation. Most of the forums that discuss genres quickly get out of control.

  1. It has entries that shouldn’t be there.
  2. It is missing entries that should be there.
  3. There is no history showing who decided what was included or why entries are on the list.

As I suggested, it appears to be arbitrary.

If RateYourMusic has already done a lot of the work to classify genres, why reinvent the wheel? MB should partner with RateYourMusic and reuse their data, and then the discussions about genre can happen on the RateYourMusic forums.

That may not be a bad idea.

For those interested, there is a list of the genres that they have validated here:

But instead of just copying and pasting it without their knowledge or consent, it would probably be more decent and respectful to reach out to them.

But that won’t address what the OP is raising here.
It doesn’t include alternate spellings or names for genres.

(sorry for being part of slightly derailing your thread @sickwolf :wink: )

I really don’t mind! I’m glad some are talking about the actual source of the problem :slight_smile:
RYM looks awesome, love the hierarchy.

I totally agree, but at the same time it makes sense to the people using X or Y genre as a descriptive term - like I use steampunk to refer to Abney Park’s music style, and it is arbitrary, but to me it makes sense. I won’t force it on anyone though, and you might prefer another genre to describe it.

We humans love tagging stuff and inventing names. (I say genre but you may refer to them as tags; I use genre loosely here, don’t throw stones)

I’m gonna go wild here:
Can we actually make a hierarchy/tree of genre and change how we input them in MB from a thread like this? I have no idea how this would work, etc. But like - can we actually change things? :angel: