Posted by Jukka K. Korpela on 08/06/05 01:35
Toby Inkster <usenet200507@tobyinkster.co.uk> wrote:
> The advantage of UTF-8 (which is a concrete representation of the more
> abstract "Unicode" set of characters) is that it has vastly more
> characters than ISO-8859-1.
That is true, but the repertoire of characters that you can use in a UTF-8
encoded HTML document is exactly the same as the one you can use in an
ISO-8859-1 encoded document, namely UCS, the Universal Character Set, also
known as the Unicode character set. The reason is that you can use
character references like &#12345; to overcome the limitations of the
encoding.
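As a rough illustration of that point, here is a small Python sketch. The
sample string is my own choice, not taken from any real page. Python's
"xmlcharrefreplace" error handler writes any character that does not fit
the target encoding as a decimal character reference, and html.unescape
stands in here for what a browser does when it parses the document.

```python
import html

# Hypothetical sample: one character inside ISO-8859-1 (U+00E9)
# and one far outside it (U+3039).
text = "caf\u00e9 \u3039"

# Characters that ISO-8859-1 cannot represent are written out as
# numeric character references instead of raising an error.
encoded = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(encoded)

# Decoding the octets and resolving the references recovers the
# original string, so no character was lost.
decoded = html.unescape(encoded.decode("iso-8859-1"))
print(decoded == text)
```

The encoded octets contain U+00E9 as the single octet 0xE9 and U+3039 as
the eight ASCII characters of its reference, which is why this approach
gets bulky when such characters are frequent.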
UTF-8 becomes advantageous with respect to ISO-8859-1 if you use _many_
characters outside the ISO-8859-1 repertoire.
> The advantage of ISO-8859-1 is that it enjoys slightly wider support
> than UTF-8.
Besides, ISO-8859-1 is more compact for most West European languages:
it uses one octet per character, whereas UTF-8 uses two octets for any
character in the upper half of the ISO-8859-1 repertoire.
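The size difference is easy to check. The following Python sketch counts
octets for a short sample phrase in both encodings; the helper name
octet_count and the phrase are assumptions made only for illustration.

```python
def octet_count(text: str, encoding: str) -> int:
    """Return the number of octets needed to store text in encoding."""
    return len(text.encode(encoding))

# Hypothetical sample with two characters from the upper half of
# ISO-8859-1 (U+00EF and U+00E9) among ten characters in total.
sample = "na\u00efve caf\u00e9"

latin1_size = octet_count(sample, "iso-8859-1")
utf8_size = octet_count(sample, "utf-8")

# ISO-8859-1 needs one octet per character; UTF-8 needs one extra
# octet for each of the two upper-half characters.
print(len(sample), latin1_size, utf8_size)
```

For mostly-ASCII text the overhead is small; it grows with the share of
accented letters in the text.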
--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html