Picard help - getting ™ into a track name

I have this release in Picard for tagging:

But Picard has swapped all the ™ to TM in the tags. Why is this?

From experimenting, I found that it is being done by the “Convert Unicode punctuation characters to ASCII” option - but ™ isn’t punctuation.

(I have this option on as I prefer ASCII versions of …, -, " and ’ in my files)

Is there a list anywhere of these kinds of substitutions? What other characters are being swapped? Can the Help file be updated to note the non-punctuation substitutions?

I think that option simply changes every character that is not in ASCII (anything not in this table) into its closest ASCII equivalent (think of things like ® becoming (R)).

If you only want a couple of characters changed you could use the $replace function in Picard.

That is what I initially thought, and then realised it leaves Japanese and other non-ASCII text alone. It “says” punctuation, so I assumed there is a list it works from.

I have thought about tweaking one of the plugins to do the changes I need. The ', " and - changes are the main ones I focus on, as these cause havoc in searches and sorting when inconsistently scattered in releases. The dash is especially confusing in file names for me as I can’t see the difference. :smiley:
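
A minimal sketch of that kind of targeted change, in plain Python: it maps only a hand-picked set of characters and leaves everything else (™, Japanese text) alone. The name `simplify_selected` and the `SELECTED` table are made up for illustration and are not taken from Picard or any of its plugins; hooking this into Picard is left out.

```python
# Hypothetical table: only the characters I care about, nothing else.
SELECTED = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2010": "-",    # Unicode hyphen
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
})


def simplify_selected(text):
    """Replace only the characters listed in SELECTED; leave the rest untouched."""
    return text.translate(SELECTED)


if __name__ == "__main__":
    # The trademark sign is not in the table, so it survives.
    print(simplify_selected("Don\u2019t Stop\u2122 \u2013 Live\u2026"))
```

Anything not in the table passes through unchanged, so the list of substitutions is exactly what is written there.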

From the looks of it the option applies a specific type of Unicode Normalization (NFKC), which not only replaces fancy punctuation symbols with their “simpler” alternatives but also applies the same logic to certain “composite” symbols, like the trademark symbol.

See https://www.unicode.org/charts/normalization/ and search for “2122” to see the rule concerning the trademark symbol.
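
To check this outside Picard, here is a small sketch using Python's standard `unicodedata` module. It only shows what plain NFKC does to a few characters. I am assuming, not asserting, that this matches what the option does internally, and the helper name `nfkc_changes` is made up for the example.

```python
import unicodedata


def nfkc_changes(text):
    """Return {char: NFKC form} for each character that NFKC would alter."""
    changes = {}
    for ch in text:
        normalized = unicodedata.normalize("NFKC", ch)
        if normalized != ch:
            changes[ch] = normalized
    return changes


if __name__ == "__main__":
    # Trademark sign, ellipsis, curly apostrophe, en dash, and some Japanese text.
    sample = "\u2122 \u2026 \u2019 \u2013 \u65e5\u672c\u8a9e"
    for ch, repl in nfkc_changes(sample).items():
        print(f"U+{ord(ch):04X} {ch!r} -> {repl!r}")
```

On this sample only ™ and … are reported. The curly apostrophe and the en dash have no compatibility mapping, so if Picard converts those as well, the option is presumably doing something in addition to NFKC.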

Thanks @elomatreb - that makes some sense. Hope someone who writes the manual sees this and can add that link and explain which chart they are using. :slight_smile: