More UTF-8 additions? (ordinal number suffixes)

Came across this edit today, https://musicbrainz.org/edit/37857544

Is this desired? I don’t see anything about it in the style guide.

For what it’s worth, the fancy string in that edit (“19ᵗʰ”) is handled sensibly by:

  • Lucene (which powers MB’s search server)
  • The ICU collating tools, and
  • Python’s unicodedata.normalize() (so presumably Picard can ascii-ify it).

So I personally don’t see any harm in it.
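
For anyone who wants to check the unicodedata.normalize() point above, here is a minimal sketch. The helper name `ascii_ify` is made up for illustration, and this is not a claim about what Picard actually does internally; it only shows that compatibility normalization (NFKD) folds the two modifier letters back to plain "t" and "h".

```python
import unicodedata

def ascii_ify(text):
    """Hypothetical helper: fold compatibility characters to ASCII where possible."""
    # NFKD applies compatibility decompositions, so superscript-like
    # modifier letters are replaced by their plain base letters.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop anything that still has no ASCII equivalent (e.g. combining marks).
    return decomposed.encode("ascii", "ignore").decode("ascii")

fancy = "19\u1d57\u02b0"  # the string from the edit: "19ᵗʰ"
print(ascii_ify(fancy))
```

Note that the final encode step silently discards characters with no ASCII fallback, so this is only suitable for matching or sorting, not for display.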

2 Likes

Depends who you ask, I guess. It is obviously desired by the editor who did it, or they wouldn’t have done it. :wink:

From the standpoint of Unicode’s plain-text philosophy, it is not desirable: things like superscripts are presentational and should be handled out-of-band, just as there are no “bold letters” in Unicode. The only reason Unicode contains some of these as their own characters is compatibility with legacy encodings.

4 Likes

As far as I know, the superscript isn’t generally preferred; it’s a matter of taste. If that’s the case (I could be wrong), it should only be in the recording title if that’s the most common representation of the recording. If it is, I don’t see any problem with it.

For the interested: Date and time notation in the United Kingdom and in the United States

1 Like

I’m pretty sure those are wrong.

From what I can find, those characters come from Unicode blocks meant for phonetic notation, basically for pronunciation guides. Using them to fake a small superscript ‘t’ and ‘h’ is a massive hack and probably should not be done.
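
A quick way to see which characters are involved is to print their code points and names. This is only a sketch: Python's unicodedata module does not expose block names, so the two block ranges below are copied by hand from the Unicode block list, and the `describe` helper and `BLOCKS` table are made up for illustration.

```python
import unicodedata

# Block ranges entered by hand (assumption: only the two blocks relevant here).
BLOCKS = {
    "Spacing Modifier Letters": (0x02B0, 0x02FF),
    "Phonetic Extensions": (0x1D00, 0x1D7F),
}

def describe(ch):
    """Return (code point, character name, block name or 'other') for one character."""
    cp = ord(ch)
    block = next(
        (name for name, (lo, hi) in BLOCKS.items() if lo <= cp <= hi),
        "other",
    )
    return f"U+{cp:04X}", unicodedata.name(ch), block

for ch in "19\u1d57\u02b0":  # "19ᵗʰ"
    print(describe(ch))
```

Both letters are named as modifier letters rather than as superscripts, and they sit in two different blocks.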


5 Likes

I’d only agree with this if it was clearly artist intent. If it was, though, I’d have no issue whatsoever with how it was accomplished, Unicode-wise.

I’ve reverted this edit and removed the use of these characters in a couple of release titles.

2 Likes