Correct hyphen: Unicode HYPHEN or HYPHEN-MINUS

I will let the “grownups” decide what is the proper character to use.
But I will always be using - and ’

In fact, I (and many others) will be limited to using `~!@#$%^&*()-_=+[{]}|;:’",<.>/?
because that is all the keyboard offers. I know that there are “combinations” I can use to make other things appear. I just won’t take the time to learn.

So, keep that in mind when having the discussion. Your ‘average’ person behind the keyboard is only going to know the keyboard characters.

4 Likes

And don’t forget the different meanings in different languages and different usages.
Just two examples:
English Hyphen-minus
German Bindestrich-Minus

Personally I don’t mind either way. If MB wants to use prettified characters that are hard for many to understand - then that is MB’s choice.

You asked for opinions - so here are my thoughts. I personally think the hyphen is a step too far down the pedantic road. Yeah, technically correct, but awkward to work with outside of the MB screen.

It is difficult for me to even know when to use it. I haven’t done English Grammar since the 1980s so I avoid trying to work out what is needed. Thanks to scripts I can put in the apostrophes and speech marks, but beyond that I start getting confused at the rules.

I don’t find anything “ambiguous” about the hyphen on my keyboard. I know it can be used in a multitude of places. What I find ambiguous are these new rules of where to use all the different types of dashes\hyphens\etc. I come here for music, not English lessons. :smiley:

The important thing for me is both my CD ripper EAC and tagger Picard can strip it out and swap it back to a keyboard based dash\hyphen.

These can be very annoying in filenames. Nothing is more confusing than two folders side by side with different hyphens, especially as they are hard to tell apart by eye.

It also messes up searches on anywhere that is not MB as not all search tools know to treat them the same in a search. I’m also back to not being able to type these things when I don’t have the Magic Scripts.

It is a headache when it starts to appear in my media library. I then have to find a way to treat it in there. Not all fonts have glyphs for these Unicode characters, so splodges and squares appear where there should be a neat little dash.

(It does make me laugh that this forum makes it impossible to really talk about ‐ or - as it sets all dashes to the same Unicode item. Or does it? Looks like it leaves those alone. I can’t tell - they confuse me)

So - yeah - I don’t like 'em in my filenames. But that is easily solved, so not really a problem to me.

But do remember that as @justcheckingitout points out, the average person has never heard of these. Or would even understand them. So please don’t be surprised when normal people like @Kid_Devine try and “correct” these oddities so they can use the data in their own applications. I know it has also popped up as a problem on the KODI media centre when Olivia Newton‐John was causing some confusion due to the unicode hyphen-dash.

Are they also fully documented to people who are using the MB API? Do they know that if they type Olivia Newton-John then they will not find her in a search?

4 Likes

IMO that should be fixed on that computer — it’s not properly the job of Musicbrainz to worry about what characters every font on every computer in the world may or may not have.

6 Likes

It’s not that difficult: :wink:

  • HYPHEN (trait d’union in French) is for linking words
  • MINUS is for mathematical operations
  • EN DASH and EM DASH are the opposite of linking characters: like brackets, they are separators (EN is shorter than EM)
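
For anyone who cannot tell these apart by eye, here is a minimal sketch that prints the code point and official Unicode name of each character in the list above. It uses only Python's standard `unicodedata` module; the `describe` helper and the `CHARS` list are names made up for this illustration.

```python
import unicodedata

# The look-alike characters discussed in this thread, written as escapes
# so they cannot be silently swapped by an editor or a forum.
CHARS = ["\u002D", "\u2010", "\u2212", "\u2013", "\u2014"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in CHARS:
    print(describe(ch))
```

Pasting a suspicious character from a title into `describe` is a quick way to see which one you actually have.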
3 Likes

Is the U+2010 HYPHEN displaying correctly for you? If so, I wonder what font your browser is using. If it helps, I use the Font Finder extension to check what fonts are actually displayed. I think it’s available on most browsers.

1 Like

I think the Unicode hyphen should be used as much as possible, because it is more typographically correct. But there’s no way to enforce it (and that’s a good thing). (Look, I didn’t use ’ but the good ol’ ').

7 Likes

I can’t speak for @jesus2099 but on mine (Chrome Windows 7 64-bit) it renders using Verdana, according to the Chrome “Inspect element” feature.

Update: I installed Bitstream Vera Sans, and Inspect suggests that the artist name element in the edit page still renders properly with Bitstream Vera Sans including the hyphen character U+2010, although on the actual artist page the name is rendered with Bitstream Vera Sans for 20 characters and Lucida Sans Unicode for one character (presumably the hyphen)

1 Like

Discourse prettifies them, anyway. :sweat_smile:

2 Likes

Note that the en dash is the grammatically correct character for date ranges, a fairly common use case of MusicBrainz:

Band: Greatest Hits 1970–1979

I imagine most editors just use the keyboard hyphen here.
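
As a rough illustration of how a script could tidy this one case, here is a minimal sketch that swaps the keyboard hyphen for an en dash only when it sits between two four-digit numbers. The function name and the pattern are assumptions made for this example, not anything MusicBrainz or Picard actually does, and it would also rewrite things that merely look like year ranges (a catalogue number such as 1234-5678, say).

```python
import re

# Two four-digit numbers joined by a HYPHEN-MINUS, e.g. "1970-1979".
YEAR_RANGE = re.compile(r"\b(\d{4})-(\d{4})\b")


def en_dash_year_ranges(title):
    """Replace the hyphen-minus in year ranges with U+2013 EN DASH (sketch only)."""
    return YEAR_RANGE.sub("\\1\u2013\\2", title)


print(en_dash_year_ranges("Band: Greatest Hits 1970-1979"))
```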

4 Likes

What would be really useful to see is a short note in the Documentation of how this prettification can be removed in Picard for those who are tagging and using the data outside of the database. There is the Convert Unicode Punctuation characters to ASCII option, but it would be good to see a list of what it is actually swapping.
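
To show the general idea of such a conversion, here is a minimal sketch of a punctuation-to-ASCII pass. The mapping below is an assumption put together for illustration and is NOT the list Picard actually uses; `TO_ASCII` and `to_ascii_punctuation` are made-up names.

```python
# Hypothetical mapping for illustration only; Picard's real list may differ.
TO_ASCII = str.maketrans({
    "\u2010": "-",    # HYPHEN
    "\u2011": "-",    # NON-BREAKING HYPHEN
    "\u2012": "-",    # FIGURE DASH
    "\u2013": "-",    # EN DASH
    "\u2014": "-",    # EM DASH
    "\u2212": "-",    # MINUS SIGN
    "\u2018": "'",    # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",    # RIGHT SINGLE QUOTATION MARK
    "\u201C": '"',    # LEFT DOUBLE QUOTATION MARK
    "\u201D": '"',    # RIGHT DOUBLE QUOTATION MARK
    "\u2026": "...",  # HORIZONTAL ELLIPSIS
})


def to_ascii_punctuation(text):
    """Swap the mapped Unicode punctuation for keyboard characters."""
    return text.translate(TO_ASCII)


print(to_ascii_punctuation("Olivia Newton\u2010John"))
```

A table like this is exactly the kind of list that would be good to see documented.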

That should help reassure those who think this is madness.:wink:

I was also serious with my question about the API - how does that handle the fact you can have the inconsistencies of the dash? Would a search for “Easy-Star All Star” pick up all variations?

3 Likes

Isn’t the answer to this perennial and somewhat tedious argument to have MB decide?
If MB wants U+2018 or U+2019 to be used and it is given U+0027, let MB switch it.
Think of the thousands of edits you have seen where someone has spent time switching U+0027 to U+2019.
Same with hyphens - depending on context, let MB decide. Then we can all get on with something productive.

1 Like

That’s a nice idea in theory, but depending on the language and situation, ' would have to be replaced by ‘, ’ or even ‚ (which is not a comma :slight_smile:). Hyphen-minuses are even more problematic, even in English. To automatically choose the right -, – or — would be next to impossible.

4 Likes

A big stumbling block is that the average user just can’t work out how to enter this stuff. When I first asked about it, I couldn’t get a clear answer on which ALT+NUMPAD combo to type. Only now that I have some of @Jesus2099’s scripts can I finally enter these.

Silly suggestions to make life easier - have a MENU available for noobs to enter these. U+2018 means nothing to the average computer user. Give them the keyboard combos and it is more likely to happen.

And a daft question, but why is it that the actual website does not follow these rules?
For a while when I was entering new tracks I spotted that apostrophe on the edit page and was copy\pasting that in… and then later realised it isn’t the same one. (I have to squint really hard to spot the differences so only the “Update the recording title to match the track title” question flagged up I was using the wrong thing.)

3 Likes

Most of the text on this website is very old, before it was decided to put typographically correct punctuation in the guidelines. I guess nobody bothered to update it. For what it’s worth, I used typographically correct punctuation in the Dutch translation, and other translations probably do the same. So that’s one place where the translations are ahead of the English website. :wink:

It would be very useful to have a box on edit pages with clickable punctuation that would insert the correct character at the current cursor location. I thought there was a ticket for it, but I can’t find it.

3 Likes

That is indeed the main issue with those characters.
I use some AutoHotKey macros on my keyboard (permanently, not only for MB).
Recently I have also been using mb.unicodechars, an awesome popup user script written by @Smeulf, with which you can find those fancy but useful characters by pressing Ctrl+m from within any MB editing text field.


I realise with shame that I have created this duplicate topic of Correct hyphen: Unicode HYPHEN or HYPHEN-MINUS.
If any @moderators could merge it into that older topic… :bowing_man:

I think that probably shows you will never get a simple consensus on this issue. I know if I did a survey of my clients I would expect maybe two people out of three hundred would know what I was talking about.

What I like is that MB has this as a recommended but not compulsory action. This allows new users to add new data correctly. And then someone else can prettify if they want to. This should not be used to cause arguments as I have seen people driven away from MB by this. (That included me initially)

It is great when you spend your time soaked in Internet language and terminology. But this is a music database first so accuracy of data should always be first. A lot of people who know their music won’t understand this stuff.

5 Likes

I so agree. MusicBrainz can’t expect most editors to know or care about the difference, and the priority should be making it easier to add data to MusicBrainz, not harder. As an editor I would just use the characters easily available on the keyboard, but if MusicBrainz wants to change them I don’t mind, as long as it is done automatically rather than through negative votes on my edits.

Two other things to consider

  1. Do those who insist on using the ‘correct’ character keep to this policy when doing non-MB things such as writing a letter?
  2. As part of my Albunack project, when trying to match artists to Discogs artists, I simplify the punctuation to find matches, and this works well. Insistence on using the correct characters, by contrast, probably prevents various MusicBrainz import scripts from working as well as they could.
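
To make the second point concrete, here is a minimal sketch of the kind of punctuation simplification described there. It is an illustration under my own assumptions, not Albunack's actual code: `match_key` is a made-up name, and it simply treats every character in a Unicode punctuation category as a word separator before comparing.

```python
import unicodedata


def match_key(name):
    """Build a loose comparison key: punctuation becomes a space, case is folded."""
    chars = []
    for ch in name:
        # General categories starting with "P" are punctuation; "Pd" covers
        # both HYPHEN-MINUS and U+2010 HYPHEN, "Po"/"Pf" cover apostrophes.
        if unicodedata.category(ch).startswith("P"):
            chars.append(" ")
        else:
            chars.append(ch)
    # Collapse runs of whitespace and ignore case differences.
    return " ".join("".join(chars).casefold().split())


print(match_key("Olivia Newton\u2010John") == match_key("Olivia Newton-John"))
```

With a key like this, both spellings of the artist name compare equal, whichever hyphen was typed.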
4 Likes

The search should treat all punctuation as irrelevant, as mere separators.
If the search is fooled by HYPHEN, please open a bug ticket, IMO.

3 Likes
  1. I do.
  2. What Jesus said.

Also see my point of automating it being impossible above.

1 Like