Posted by Jukka K. Korpela on 05/27/06 16:39
ironcorona <iron.corona@gmail.com> scripsit:
> Luigi Donatello Asero wrote:
>
>> Well, so far I have been able to display Russian and Chinese in
>> UTF-16.
>
> How come you're using UTF-16?
I think Luigi Asero has refused to understand the principles of character
encoding. That might explain part of the phenomenon.
> Russian and Chinese can both be encoded
> in UTF-8.
Undoubtedly. Pretty much anything that can be expressed as written text in
computer-readable form can be encoded in UTF-8. More exactly, all Unicode
text can be encoded in UTF-8.
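
As a rough illustration (a sketch only; the sample strings and the helper name roundtrip are made up for this example, not taken from the thread), Python can show that Russian and Chinese text survives a round trip through UTF-8 just as it does through UTF-16, and that even the highest Unicode code point has a UTF-8 form:

```python
# Sample strings chosen for illustration only.
samples = {"Russian": "Привет", "Chinese": "你好"}

def roundtrip(text, encoding):
    """Encode text, decode it again, and report the bytes and whether it survived."""
    data = text.encode(encoding)
    return data, data.decode(encoding) == text

for name, text in samples.items():
    for enc in ("utf-8", "utf-16-le"):
        data, ok = roundtrip(text, enc)
        print(name, enc, len(data), "bytes, round trip ok:", ok)

# The last code point in Unicode, U+10FFFF, takes four bytes in UTF-8.
highest = chr(0x10FFFF).encode("utf-8")
print(len(highest))
```

The byte counts differ between the two encodings, but the set of characters that can be represented is the same.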
> Though I would like to ask; how many Chinese symbols are there?
A few myriads. The exact number depends on your ontology of symbols. (Does a
symbol exist if it is known from one single written document only? What
about two?)
> Can you encode them *all* in UTF-8?
No, because not all Chinese symbols have (yet) been included in Unicode.
Theoretically, you could encode them yourself, using Private Use code
points, which naturally have UTF-8 encoding, too, but that's hardly a
feasible solution in HTML authoring.
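
To make the Private Use point concrete, here is a minimal sketch (the particular code points U+E000 and U+F0000 are picked arbitrarily as examples). It shows that such code points encode to UTF-8 like any other, while Unicode itself assigns them no meaning, which is why they only work by private agreement between author and reader:

```python
import unicodedata

# Two arbitrarily chosen Private Use code points:
# U+E000 (Basic Multilingual Plane) and U+F0000 (Plane 15).
bmp_pua = "\ue000"
plane15_pua = "\U000f0000"

for ch in (bmp_pua, plane15_pua):
    encoded = ch.encode("utf-8")
    # General category "Co" means "Other, private use".
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), encoded)

# No standard character name exists, so a default is returned instead.
print(unicodedata.name(bmp_pua, "<no name assigned>"))
```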
> Also [and wildly off topic] how do
> you make up new [written] words in Chinese?
I think it's really wildly off-topic, and a good book on Chinese writing
systems might help. The answer also depends on your definition of "word".
--
Yucca, http://www.cs.tut.fi/~jkorpela/