Please add the Chinese language to the page of the website

I understand that the W3C standard still accepts zh for legacy and backward compatibility reasons. Major modern browsers support cmn because they support the web standard, as well as zh for backward compatibility. Since MB is still developing its software and the website does not yet have a Chinese version, there are no backward compatibility issues to consider. But this is not a big deal and it’s for the developers to decide.

Yes, zh-cn and zh-tw.
These are the only two.

This has always been a hack, though, as traditional Han is not specific to Taiwan.

Don’t we have something that looks like Hant and Hans somewhere in MB already?
Maybe in the release script, certainly.

I agree. If we are to change zh-Hans to zh-TW, then we must support other regional variants of zh. Take this example: Traditional Chinese is the predominant Chinese script used in Taiwan and Hong Kong. 監製 in Hong Kong is producer, but in Taiwan, it’s executive producer. Executive producer in Hong Kong is 執行監製. Producer in Taiwan is 製作人. If we only support one regional variant of Chinese, we are inducing editors to enter the wrong relationships.
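To make the ambiguity concrete, here is a minimal sketch using only the four terms from the example above. The dictionary layout and the `meanings` helper are hypothetical and are not taken from MB's code or translation files.

```python
# Relationship labels per locale, using only the terms from the example above.
# The data structure and function name are illustrative assumptions.
RELATIONSHIP_LABELS = {
    "producer": {"zh-HK": "監製", "zh-TW": "製作人"},
    "executive producer": {"zh-HK": "執行監製", "zh-TW": "監製"},
}


def meanings(label):
    """Return what a given Chinese label means in each locale that uses it."""
    result = {}
    for relationship, by_locale in RELATIONSHIP_LABELS.items():
        for locale, text in by_locale.items():
            if text == label:
                result[locale] = relationship
    return result


print(meanings("監製"))
```

The same label maps to two different relationships depending on the region, which is why a single Traditional Chinese translation can mislead editors in one of the two regions.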

I have only seen these two on I18N’s mature website, and I have shown simplified and traditional Chinese here. These are the only two.

When you say other variants, that means spoken words, not written characters.
In fact, there are only two written forms: “简体中文” (Simplified Chinese) and “繁体中文” (Traditional Chinese).

Classifying Chinese is always tricky…

It’s not just “for legacy and backward compatibility”: zh is the de facto standard tag for written Chinese, and I’ve never seen cmn used for anything other than audio. You say, correctly, that zh is for the Chinese macrolanguage, but the fact is that there is no tag for standard written Chinese, which isn’t exactly Mandarin. Browsers support cmn only in the sense that it is a valid language tag, but they aren’t expecting it (the same is true for all other kinds of software). You can see the link above as an example: if your browser locale is China it requests zh-CN, and if it is Taiwan it requests zh-TW; both fall back to zh, and then to en, not to cmn. That means that if your browser is in Chinese and there is no page in zh, only in cmn, you will get it in English, not Chinese. An obvious reason for this is that native Cantonese speakers also write in standard written Chinese (zh), but would never say they write Mandarin (cmn). For the written language that in Chinese is called 中文, the de facto standard tag is zh.
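The fallback described above can be sketched roughly as follows. This is a simplified truncation lookup in the spirit of RFC 4647 "Lookup", not the actual logic of any particular browser or of the MB server; the function names and the `en` default are assumptions for illustration.

```python
def fallback_chain(tag, default="en"):
    """Return the lookup order for a language tag, most specific first."""
    parts = tag.split("-")
    # Drop one subtag at a time from the right: zh-TW -> zh
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if default not in chain:
        chain.append(default)
    return chain


def pick_translation(requested, available, default="en"):
    """Pick the first tag in the fallback chain that has a translation."""
    by_lower = {t.lower(): t for t in available}  # tags are case-insensitive
    for candidate in fallback_chain(requested, default):
        match = by_lower.get(candidate.lower())
        if match is not None:
            return match
    return None


print(fallback_chain("zh-TW"))
print(pick_translation("zh-TW", ["cmn", "en"]))
```

With only `cmn` and `en` available, a `zh-TW` request ends up at `en`, because `cmn` never appears in the chain derived from a `zh-*` tag.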

For the script it makes sense, but the issue here is language. We both write in the Latin script, but probably neither of us speaks Latin…

zh-Hant and zh-Hans also make sense when the content is exactly the same and only the characters change (I mean, the same characters in different variants), but the way it is set up now you can have different translations in the two variants. This is also common for professional translations: you send the zh-TW to a translator in Taiwan and the zh-CN to a translator in China. And even if it were the same content, you would still have the issue that zh-Hant and zh-Hans aren’t common, and the browser won’t expect them.

We already do, in the sense that new languages can be requested, but it’s not likely that you will have people to support all these variants (few projects do), and we don’t even have enough translated in the two Chinese translations we have now to actually use them.

This is a good point, but we can’t have a whole new translation because one word is different in one region. Even if you just copy the existing translation and change one word, it would require somebody to review every new string to be sure it doesn’t include one of these words, which is not realistic. But I actually wasn’t aware of this difference (and I should be). Do you know any other such cases in MB’s terminology? Either way, with or without additional localizations, it’s important to have this information in the Wiki and at least mention it in the (very lacking) Chinese guidelines.

I’m wondering why choose zh-TW for Chinese traditional Han.
Seems strange to use a country code for a script.

Did Hongkongers stop using Traditional Han since 1997?
Hong Kong is (was? I mostly know pre-97) a big music producer with traditional Han printed on every release.
And Singapore, don’t they use Traditional Han, as well (this I know very little)?
Or is it only in Taiwan, now?

Yes, Traditional Chinese is still used in Hong Kong. Traditional Chinese is the official script in Taiwan (ROC), Hong Kong and Macau. Simplified Chinese is the official script in China (PRC), Singapore and Malaysia.
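As a rough sketch, the region list above can be written as a lookup table. The table contents come straight from this post; the `script_for` helper and its simplified subtag detection are hypothetical, and it does not use any real locale library.

```python
# Region -> script, as listed in the post above (ISO 3166 region codes).
OFFICIAL_SCRIPT = {
    "TW": "Hant", "HK": "Hant", "MO": "Hant",
    "CN": "Hans", "SG": "Hans", "MY": "Hans",
}


def script_for(tag):
    """Guess the script implied by a zh-* tag; None if it cannot be inferred."""
    subtags = tag.split("-")[1:]
    # An explicit four-letter script subtag (e.g. zh-Hans-HK) wins.
    for sub in subtags:
        if len(sub) == 4 and sub.isalpha():
            return sub.title()
    # Otherwise fall back to the region's script from the table.
    for sub in subtags:
        if len(sub) == 2 and sub.isalpha():
            return OFFICIAL_SCRIPT.get(sub.upper())
    return None


print(script_for("zh-HK"))
print(script_for("zh"))
```

A bare `zh` gives no answer here, which matches the point that the script cannot be known without a region or script subtag.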

I think it still is, though probably not as much as pre-97, and definitely the main base for music in Cantonese. I would suspect not as much for international releases, but I don’t know this for a fact.

It’s not really using a country code for a script; it’s both things. Language tags are generally language-REGION, so that’s what most systems will expect. Also, there are always minor differences in vocabulary and character choice, even in regions that use the same script (e.g. Hong Kong/Taiwan, China/Singapore), so it still makes sense to indicate the region. I think we can think about it a little like English spelling in the UK and US: there are differences in spelling, but we don’t mark that in the language tag. You just know that in en-US you are going to spell “centre” as “center” and “colour” as “color”, and in zh-TW you are going to write “音乐” as “音樂” and “艺术家” as “藝術家”. Of course, the spelling “colour” isn’t just used in the UK, it’s probably used in all Commonwealth countries, and “藝術家” isn’t just used in Taiwan, it’s used in all regions where traditional characters are used.
