ASCII originally used some code points for multiple purposes. The most well-known cases are the uses of ' as “typewriter quotation mark” and “typewriter apostrophe”, but - was also used for the regular hyphen, the various dashes, and the minus sign.
Unicode split those up by adding separate ‘ ’ ′ code points, and a number of dashes, too. It also introduced U+2010 as an unambiguous way to designate a hyphen (whereas a - could be a legacy minus sign or dash). However, unlike the quotation/prime marks, and unlike the dashes, the hyphen and the hyphen-minus look identical, so interest in actually using U+2010 has remained rather low. Personally, I don’t think using it makes much sense, either.
We should probably treat them as “quasi” canonically equivalent and consistently convert all input to one of them.
I had a look at the database: Currently, 558545 recording titles contain a hyphen-minus, and 4543 contain a U+2010 hyphen.
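
Converting all input to one of the two could look roughly like the sketch below. It assumes we settle on the hyphen-minus (U+002D), simply because it is by far the more common of the two in the numbers above; the function and constant names are made up for illustration and are not taken from any existing code.

```python
# Hypothetical sketch: fold the Unicode hyphen into the ASCII hyphen-minus.
HYPHEN_MINUS = "\u002D"  # "-", the ambiguous ASCII character
HYPHEN = "\u2010"        # the unambiguous Unicode hyphen

# Assumption: U+2010 is mapped to U+002D (not the other way round),
# because U+002D is what almost all existing titles already use.
_TO_HYPHEN_MINUS = str.maketrans({HYPHEN: HYPHEN_MINUS})


def normalize_hyphens(text: str) -> str:
    """Replace every U+2010 with U+002D.

    Real dashes (e.g. U+2013) and the minus sign (U+2212) are left alone,
    since those are visually distinct and carry their own meaning.
    """
    return text.translate(_TO_HYPHEN_MINUS)


def contains_hyphen(text: str) -> bool:
    """Tell whether a title still contains a U+2010 hyphen."""
    return HYPHEN in text
```

This has to be an explicit step: U+2010 has no decomposition mapping, so the standard Unicode normalization forms leave it untouched, which is why the equivalence is only “quasi” canonical.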