[STYLE-2498]: Korean or Hangul for script?

reosarevok · March 18, 2024, 12:44pm

Hi!

It was brought up in STYLE-2498 and on this edit that it is unclear what script should be used for Korean releases that only use Hangul (so, most Korean releases). I certainly don’t know enough to make a call, but we have plenty of editors for Korean music who will almost certainly have an opinion.

Right now, Korean is marked as a frequent script and it’s sort of expected to just pick that for Korean releases. The script documentation says “This covers any combination of Hangul and Hanja for Korean” and it’s basically the equivalent to the Japanese script, “This covers any combination of Kanji, Hiragana and Katakana for Japanese”.

For Japanese we specifically say to just use Japanese script even if a release uses Katakana, unless it’s a transliteration. That said, I understand that the issue with Japanese is that script combination is very common, while almost every Korean release (at least the modern ones) is exclusively Hangul.

Should we suggest using Hangul for Korean releases unless they include any Hanja, and put Hangul rather than Korean in the frequently used scripts list?
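
To make the proposed rule concrete, here is a minimal sketch of "Hangul unless any Hanja is present". The helper name `suggest_korean_script` is hypothetical and not part of MusicBrainz. It relies only on Unicode character names from Python's standard `unicodedata` module and ignores Latin letters, digits and punctuation, so it says nothing about when [Multiple scripts] would be the better choice.

```python
from __future__ import annotations

import unicodedata


def suggest_korean_script(title: str) -> str | None:
    """Hypothetical helper: suggest an ISO 15924 code for a Korean title.

    Returns "Hang" for Hangul-only titles, "Kore" when Hangul and Hanja
    are mixed, and None when the title contains no Hangul at all.
    """
    has_hangul = False
    has_hanja = False
    for ch in title:
        # unicodedata.name() returns e.g. "HANGUL SYLLABLE GA" or
        # "CJK UNIFIED IDEOGRAPH-97D3"; unnamed characters yield "".
        name = unicodedata.name(ch, "")
        if name.startswith("HANGUL"):
            has_hangul = True
        elif name.startswith(("CJK UNIFIED IDEOGRAPH", "CJK COMPATIBILITY IDEOGRAPH")):
            has_hanja = True
    if has_hangul and has_hanja:
        return "Kore"
    if has_hangul:
        return "Hang"
    return None


print(suggest_korean_script("사랑"))          # Hangul only
print(suggest_korean_script("大韓民國 만세"))  # Hanja plus Hangul
```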

jesus2099 · March 18, 2024, 2:58pm

I would say yes.

Are you sure?
Where do we say that?

But anyway, as the artist name is very likely to be in Kanji, for me the language and script relate to both the track titles and the track artists, so Japanese would frequently be OK.

But for me きみがいるから is Japanese language in Hiragana script and メイクアップ is English language in Katakana script, not just Japanese script.

DontMindMe · March 18, 2024, 3:30pm

Yes from me as well. Making a ticket for this was on my todo list I never got around to. I can’t personally think of any examples I’ve added that had both Hangul and Hanja.

reosarevok · March 18, 2024, 3:40pm

In the script documentation I linked above. But it does make an exception for transliterations, probably because that’s the only time they really use only or mostly Katakana?

jesus2099 · March 19, 2024, 1:02pm

I searched and edited some release scripts.
(Japanese, sorry off topic)

ms0010 · March 20, 2024, 9:59am

Regarding the Japanese language, what is the purpose of the “katakana,” “hiragana,” and “syllabaries” categories? As a Japanese person, I don’t think subdividing Japanese into these categories makes sense, but is it useful for non-native speakers? (Perhaps native Korean speakers also feel that there is no point in creating a separate “Hangul”.)

If it is meant to be used for transliteration, I can understand why it is used to distinguish between the Japanese translation of the original English title and the katakana reading of the original English title.

Example:
Original Latin Title: Ticket to Ride (by the Beatles)
Katakana: チケット・トゥ・ライド
Japanese: 涙の乗車券
Hiragana: なみだのじょうしゃけん
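
As a rough illustration of how these four titles differ at the character level, the sketch below lists which kinds of characters each one contains. The function name `japanese_script_kinds` is made up for this example. It only inspects Unicode character names via Python's standard `unicodedata` module, so it cannot tell a transliteration from a translation; that still needs a human.

```python
from __future__ import annotations

import unicodedata


def japanese_script_kinds(title: str) -> set[str]:
    """Hypothetical helper: return the kinds of characters found in a title."""
    kinds = set()
    for ch in title:
        # Spaces and unnamed characters match none of the prefixes below.
        name = unicodedata.name(ch, "")
        if name.startswith("HIRAGANA"):
            kinds.add("Hiragana")
        elif name.startswith("KATAKANA"):
            # Also covers the katakana middle dot (U+30FB).
            kinds.add("Katakana")
        elif name.startswith("CJK UNIFIED IDEOGRAPH"):
            kinds.add("Kanji")
        elif name.startswith("LATIN"):
            kinds.add("Latin")
    return kinds


for title in ["Ticket to Ride", "チケット・トゥ・ライド", "涙の乗車券", "なみだのじょうしゃけん"]:
    print(title, sorted(japanese_script_kinds(title)))
```

Only the third title mixes Kanji and Hiragana, which is the case the combined “Japanese” script is documented to cover.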

jesus2099 · March 20, 2024, 2:18pm

Mmmh no you’re completely right, IMO.
I don’t think anyone cares, either.
Good point, thank you!!

We can still handle only Japanese script:

Title	Language	Script
Ticket to Ride	English	Latin
チケット・トゥ・ライド	English	Japanese
涙の乗車券	Japanese	Japanese
なみだのじょうしゃけん	Japanese	Japanese
NAMIDANO JŌSHAKEN	Japanese	Latin

Indeed we have two identical Japanese/Japanese rows, if we are dropping syllabaries.

But what would be the point of a pseudo-release Japanese transcription in hiragana, anyway?

Do we have any?

So if we extend that to Korean, is it really that useful to have both Korean (Hangeul plus Hanja) and Hangeul?

ms0010 · March 21, 2024, 9:52am

There would be no necessity to register only in hiragana. The previous example was just a possible case to justify the existence of those “subcategories”. And reading guides such as furigana and ruby should be handled by Alias.

Depending on the actual usage of the MBdb and the compatibility with external applications (if any exist), it seems simplification would be better – “Japanese” only.

Maxr1998 · March 21, 2024, 4:28pm

I’d definitely be in favor. Most Korean releases I add are multiple languages/multiple scripts since there’s an equal mix of English/Latin and Korean/Hangul, but for those that are mostly Korean it’s nearly always Hangul only. Since it’s the more specific and more common script, I’d say it should be prominently suggested instead of Korean.

yindesu · March 21, 2024, 9:32pm

I am pretty sure the intended purpose of “Katakana” and “Hiragana” was solely for pseudo-releases (alternate tracklists) used for furigana, whereas “Japanese” was intended for everything else (including any combination of Latin script and Japanese characters).

Some people “won” the debate though and classify Japanese titles as “Multiple scripts” now. In a world where the release’s “Script” field can’t be used to figure out what kind of tracklist is being pulled down by your software, there are few reasons to keep around both “Korean” and “Hangul” as scripts, just like there are few reasons to keep around the trio of “Japanese”, “Hiragana”, and “Katakana”.

Likewise, there’s no reason to keep around “Han (Hanzi, Kanji, Hanja)” since the spirit of the Script field is being ignored.

jesus2099 · March 21, 2024, 9:47pm

IMO you should make them Korean, not Multiple.
Same as for Japanese and Latin should be Japanese.
Multiple is not very useful, IMO.

Maxr1998 · March 26, 2024, 5:16pm

For releases where it’s only a few English/Latin words, I usually did that. And when it’s the other way around where there’s only one or two Korean titles with everything else in English/Latin, that combination seems to fit the best. The complexity arises around 50/50 releases.

Most preferably, I’d love to specify multiple languages/scripts instead of the [Multiple ...] catch-all, just like it’s suggested in MBS-13200.
Until that’s implemented, I’ll use the more unique language/script from now on, like you suggested. The English/Latin can then be added later on.

chaban · June 7, 2024, 4:19pm

Has a conclusion been reached yet? It seems some editors have taken matters into their own hands:
https://musicbrainz.org/edit/112499735

Maxr1998 · June 7, 2024, 6:40pm

Wasn’t this discussion mostly about the primarily suggested script in the MB UI, and not about whether Hangul is generally correct or not?

For the linked edit, Hangul is indeed correct, and doesn’t seem to go against the guidelines either. The script documentation notably doesn’t even mention Hangul specifically, and thus doesn’t have an exception like the Katakana script. Hiragana is also still used normally.

So while Korean is of course correct for this release, Hangul should be the more explicit and thus better choice.

chaban · June 7, 2024, 7:47pm

No.

MusicBrainz Style / STYLE-2498
Component/s: Guidelines

Maxr1998 · June 7, 2024, 9:59pm

But where does it say that Hangul shouldn’t be used? I still don’t see how this edit is “editors taking matters into their own hands”. Your style ticket merely asks for guideline clarification on which should be used when, and why Korean is in the frequently used list instead of Hangul. This doesn’t make Hangul script usage on Hangul-only releases incorrect.

reosarevok · June 14, 2024, 12:01pm

I haven’t seen any big arguments why Hangul is worse, so I think it’s probably fine to use it for now. I’m busy with personal stuff at the moment, but I should probably look into updating the frequent languages at some point when I’m more available again.

yindesu · June 14, 2024, 1:00pm

Can you explain why Korean script should exist if modern Korean music should be using Hangul script?

chaban · June 14, 2024, 1:21pm

I meant to post this last week already but oh well…

That’s precisely the issue. Where does it say Korean (Kore) shouldn’t be used or that Hangul (Hang) should be preferred? Unlike Japanese and some other languages there is no guideline.

To properly use the release script field we would need to understand it first.

The script list in MBS seems based on ISO 15924 which also includes special codes such as:
Common (Zyyy) or Inherited (Zinh)

Those are for Unicode. So that raises the question of what the purpose of the release script field is and how the two standards are related:
https://www.unicode.org/reports/tr24/#Relation_To_ISO15924

In some cases the match between the Script property values and the ISO 15924 codes is not precise, because the goals are somewhat different. ISO 15924 is aimed primarily at the bibliographic identification of scripts; consequently, it occasionally identifies varieties of scripts that may be useful for book cataloging, but that are not considered distinct scripts in the Unicode Standard. For example, ISO 15924 has separate script codes for the Fraktur and Gaelic varieties of the Latin script.

Take mathematical notation (Zmth) or symbols (Zsym) for example.

https://en.wikipedia.org/wiki/Script_(Unicode)#List_of_scripts_in_Unicode

Not used are, among others, the ISO 15924 script codes: Zsym (Symbols) and Zmth (Mathematical notation). These are considered not to be scripts in Unicode sense.

These are present and used in MB. Emoji releases exist, but there is no emoji script (Zsye).
Morse code releases (MBS-11876) exist, but there is no script for them.

What’s MusicBrainz’ standard for script? It includes scripts for which Unicode has no script, yet lacks many scripts for which ISO codes exist.
[multiple scripts] is an oddball, as MBS uses the private-use code Qaaa (which is reserved and therefore not safe for use).

Zsye and Zsym are also special, as they can be used for controlling the presentation of emoji.

Let’s look at BCP 47:
https://scriptsource.org/cms/scripts/page.php?item_id=language_detail&key=kor

The pairing of Korean (kor) to Korean script (Kore) seems quite clear there.
The CSS Text Module points towards assuming Kore for the Korean language.

However, BCP47 script subtags are not typically used (and are in fact discouraged) for languages strongly associated with a single writing system: instead that writing system is expected to be implied when no other is specified. [BCP47] IANA maintains a database of various languages’ most common writing system via the Suppress-Script field in its language subtag registry for this purpose.

https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

Type: language
Subtag: ko
Description: Korean
Added: 2005-10-16
Suppress-Script: Kore

Japanese language/script guidelines already seem to follow this practice where you normally use/assume Jpan script for Japanese language unless there is a need to use a different script such as Latn for transliterated tracklists.

Analogously I’d expect Korean language tracklists would normally use Kore script.
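
The Suppress-Script idea can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `parse_record` and `effective_script` are made-up names, the parser handles just the single-line `Key: value` form of the record quoted above (the real registry also has continuation lines and repeated fields), and the tag handling only looks for a four-letter script subtag directly after the language subtag.

```python
from __future__ import annotations

# The registry record quoted above, copied verbatim.
RECORD = """\
Type: language
Subtag: ko
Description: Korean
Added: 2005-10-16
Suppress-Script: Kore
"""


def parse_record(text: str) -> dict[str, str]:
    """Parse one simple 'Key: value' record into a dict (simplified)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def effective_script(language_tag: str, suppress: dict[str, str]) -> str | None:
    """Return the explicit script subtag if present, else the suppressed default."""
    subtags = language_tag.split("-")
    # A script subtag is four letters and, in this simplified view,
    # comes directly after the language subtag (e.g. "ko-Latn").
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        return subtags[1].title()
    return suppress.get(subtags[0].lower())


record = parse_record(RECORD)
suppress = {record["Subtag"]: record["Suppress-Script"]}

print(effective_script("ko", suppress))       # falls back to the Suppress-Script value
print(effective_script("ko-Latn", suppress))  # explicit script wins
```

Under this reading, a plain Korean tracklist would resolve to Kore, and a different script would only need stating when it departs from that default.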

Or maybe we should just get rid of the script field while at it:

(Note, I didn’t know anything about this before and just researched it so might be interpreting some things wrong)

yindesu · June 14, 2024, 1:23pm

As I’ve stated clearly in the past, I already no longer submit the Script field since so many voters don’t want to allow it to be useful from a software preference perspective.