Similarity/synonyms/merging for folksonomy tags?

merging
folksonomy
synonyms
similarity
tagging
musicbrainz

#1

There are some sets of tags that are basically duplicates. For example:

https://musicbrainz.org/tag/drum%20and%20bass
https://musicbrainz.org/tag/drum%20n%20bass
https://musicbrainz.org/tag/dnb

It would be good to be able to notate such tags as synonyms, perhaps as a Tag-Tag relationship. It would also be good if tagging software could grab tags with a “total synonym count”, instead of just the tag count.

It would also be good to be able to mark one tag as the master tag of a group of perfect synonyms (likely whatever Wikipedia uses - “Drum and bass”, in this case), so that software knows which tag to use when tagging.
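To make the “total synonym count” idea concrete, here is a minimal sketch. It is not how MusicBrainz stores or serves tags; the `SYNONYM_GROUPS` table, the function names, and the counts are all made up for illustration.

```python
from collections import Counter

# Hypothetical synonym groups: master tag -> its perfect synonyms.
SYNONYM_GROUPS = {
    "drum and bass": {"drum n bass", "dnb"},
}


def build_master_lookup(groups):
    """Map every tag in a group (including the master) to its master tag."""
    lookup = {}
    for master, synonyms in groups.items():
        lookup[master] = master
        for synonym in synonyms:
            lookup[synonym] = master
    return lookup


def total_synonym_counts(tag_counts, groups):
    """Sum per-tag counts into one total per master tag.

    Tags that belong to no group are kept under their own name.
    """
    lookup = build_master_lookup(groups)
    totals = Counter()
    for tag, count in tag_counts.items():
        totals[lookup.get(tag, tag)] += count
    return dict(totals)


# Illustrative counts only, not real MusicBrainz data.
raw_counts = {"drum and bass": 120, "drum n bass": 30, "dnb": 45, "jungle": 10}
merged_counts = total_synonym_counts(raw_counts, SYNONYM_GROUPS)
```

Tagging software could then write the master tag and rank it by the merged total rather than by the count of any single spelling.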


#2

@CallerNo6, you mentioned tag similarity, did you have any good examples of non-perfect synonyms, or methods for marking synonyms?


#3

I’m fairly sure some kind of aliasing is being considered for the “tags as genres” work going on behind the scenes. Software can always implement its own mappings in the meantime (and some has).

Note that «Tag-Tag relationship»s are not possible, because folksonomy tags are not entities. Aliases and Genre-Genre relationships are exactly some of the reasons I have voiced support for “genres as entities” a few times, but that will likely not happen for a while.

(Also, keep in mind that the meaning of any individual user’s tag set is entirely up to that user. «dnb» could well be used for “Deutsche Nationalbibliothek” (or similar) by some.)


#4

Is there any public info about the “tags as genres” work going on?

Tags as entities makes sense to me. Every permanent object should be an entity… Hell, when I was one of the original developers of the Drupal Relation module, we even made relationships entities, so that you could have relationships between relationships(!).

Yes, I tagged CallerNo6, because they mentioned the idea of a vote-based similarity metric. E.g. anyone can add a tag-tag relationship, but two tags are only considered synonyms once enough users have added the same relationship.
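A rough sketch of that vote-based idea, under my own assumptions: the `SynonymVotes` class, the threshold value, and the one-vote-per-user rule are hypothetical, not anything CallerNo6 or MusicBrainz has specified.

```python
from collections import defaultdict

# Assumed cut-off; a real system would need to tune this.
VOTE_THRESHOLD = 3


class SynonymVotes:
    """Collect per-user votes that two tags are synonyms."""

    def __init__(self, threshold=VOTE_THRESHOLD):
        self.threshold = threshold
        # Unordered tag pair -> set of users who voted for it.
        self._voters = defaultdict(set)

    def vote(self, user, tag_a, tag_b):
        """Record one user's vote; repeat votes by the same user do not stack."""
        pair = frozenset((tag_a, tag_b))
        if len(pair) != 2:
            raise ValueError("a tag cannot be a synonym of itself")
        self._voters[pair].add(user)

    def is_synonym(self, tag_a, tag_b):
        """True once enough distinct users have voted for the pair."""
        pair = frozenset((tag_a, tag_b))
        return len(self._voters.get(pair, ())) >= self.threshold
```

Storing the pair as a `frozenset` means a vote for (dnb, drum and bass) and one for (drum and bass, dnb) count toward the same relationship.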

It might also be worth looking at StackExchange’s tag synonym process.


#5

@Bitmap, @alastairp, @reosarevok, or @CallerNo6? I’m not sure who has “the ball” right now.

Even if 99+% of instances of “dnb” are for “drum and bass”, that doesn’t mean the one “Deutsche Nationalbibliothek” is as well. Also, “drum and bass” could be used for instrumentals that literally consist of someone playing drums and someone playing a bass (I imagine something jazzy, and/or maybe a hiphop/rap beat). Folksonomy tags are a hack for genres at best, and claiming that any two are identical is a hack as well. But I feel like I’m sidetracking from the actual discussion. C’est la vie.

Haha. Drupal 4 lyfe! (I’m also an old Drupal developer. :slight_smile: )


#6

Yeah… something else that I was thinking (that might be better as a separate discussion, but might not), is the idea of namespacing for folksonomy tags. E.g. you could have genre: drum and bass, and instrumentation: drum and bass. These might make sense as master tags, and then a vote-based similarity metric could be used to trigger an autocomplete. For example, the user types dnb, and the first suggestions include genre: drum and bass, and location: Deutsche Nationalbibliothek.
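Roughly what I have in mind for the autocomplete, as a sketch only. The `MASTER_TAGS` table, the vote numbers, and the `suggest` function are invented for illustration; nothing like this exists in MusicBrainz as far as I know.

```python
# Hypothetical data: namespaced master tag -> votes linking a short tag to it.
MASTER_TAGS = {
    "genre: drum and bass": {"dnb": 40, "drum n bass": 25},
    "instrumentation: drum and bass": {"drum n bass": 2},
    "location: Deutsche Nationalbibliothek": {"dnb": 3},
}


def suggest(typed, master_tags=MASTER_TAGS, limit=5):
    """Return master tags linked to the typed text, most-voted first."""
    typed = typed.strip().lower()
    scored = []
    for master, votes in master_tags.items():
        score = votes.get(typed, 0)
        if score > 0:
            scored.append((score, master))
    # Highest vote count first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [master for _, master in scored[:limit]]


suggestions = suggest("dnb")
```

With the made-up votes above, typing dnb offers the genre first and the library second, so the rare meaning stays reachable instead of being merged away.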


#7

See https://bugs.launchpad.net/mixxx/+bug/1741147 for a discussion of synchronising tags from DJ software - this would be a mutually beneficial arrangement for Mixxx and MusicBrainz, and it would probably be a good idea to decide on some kind of namespacing standard before a million de facto standards emerge and then need to be translated.

I think a “:” as a general namespace divider is a very good idea (and fairly common across other software?).
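For what it’s worth, splitting on the first “:” is cheap to implement. A minimal sketch, assuming the namespace is everything before the first colon and that un-namespaced tags simply have none; `parse_tag` is a hypothetical helper, not part of Mixxx or MusicBrainz.

```python
def parse_tag(raw):
    """Split 'namespace: value' on the first ':'.

    Returns (namespace, value); namespace is None when there is no ':'.
    """
    namespace, separator, value = raw.partition(":")
    if not separator:
        return None, raw.strip()
    return namespace.strip().lower(), value.strip()


parsed = parse_tag("genre: drum and bass")
plain = parse_tag("dnb")
```

One catch any standard would need to settle: existing tags that already contain a colon (a time like 12:00, say) would be misread as namespaced by a rule this simple.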