Posted by Toby A Inkster on 05/16/07 08:15
Karl C. wrote:
> I'd like to know which is the preferred code to display the umlaut mark, is
> it in decimal code &#235; or the entity &euml;?
>
> I have one page using iso-8859-1 where I just use the normal character 'ë'
> but the other site uses utf-8 which asks for unicode.
If you're using ISO-8859-1 or UTF-8, then you can just type ë straight
into the file -- no need to reference it in any special way. This is
because both of those encodings include the ë character.
You only need to use an entity or character reference such as &euml; or
&#235; when you're working in an encoding that doesn't include ë. Examples
of such encodings are US-ASCII and Shift-JIS.
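If you want to check for yourself which encodings include a given character, here is a minimal Python sketch. The helper name can_encode is my own invention for illustration, and the result depends on how Python's codecs define each encoding (its shift_jis codec, for instance), so treat it as a rough check rather than a definitive answer.

```python
# "\u00eb" is the character ë, written as an escape so that this script
# is itself pure US-ASCII.
E_DIAERESIS = "\u00eb"

def can_encode(char: str, encoding: str) -> bool:
    """Return True if `char` has a representation in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

for name in ("iso-8859-1", "utf-8", "us-ascii", "shift_jis"):
    print(name, can_encode(E_DIAERESIS, name))
```

The first two should report True and the last two False, which matches the rule above: only the last two need an entity or character reference.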
For example, say you're working on an HTML file in US-ASCII encoding.
US-ASCII is a fairly old character set with support for only 95
printable characters. In particular, it doesn't include any characters
with diacritic marks (a.k.a. "accents"). So because you can't represent ë
directly in the file, you can use one of HTML's methods of representing
that character:
    &euml;
    &#235;
    &#xEB;
That way, the file is still valid US-ASCII, as you've not directly
included the non-US-ASCII character ë -- you've only included an ampersand
(&) and a few other characters, all of which are valid US-ASCII characters.
But an HTML User-Agent, which "mentally converts" all the files it reads
into Unicode, will know to read the entity as ë.
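To see that "mental conversion" in action, here is a small sketch using Python's standard html module as a stand-in for a user agent. It is only an approximation of what a browser does, and the sample word "Noël" is just an illustration of mine.

```python
import html

# The three ways of writing the character in an HTML file.
references = ["&euml;", "&#235;", "&#xEB;"]

for ref in references:
    ref.encode("ascii")           # would raise if the text were not pure US-ASCII
    decoded = html.unescape(ref)  # roughly what a user agent ends up with
    print(ref, "->", repr(decoded))

# Going the other way: let the codec substitute a decimal character
# reference for anything US-ASCII cannot represent.
ascii_bytes = "No\u00ebl".encode("ascii", errors="xmlcharrefreplace")
print(ascii_bytes)
```

All three references decode to the same single character, and the last line shows the decimal form being generated automatically when writing US-ASCII output.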
With regard to which you should use, it doesn't really matter except in
some exceptional circumstances.
Circumstance 1: Hexadecimal character references (ones beginning with
"&#x") tend to have slightly poorer support in some very old browsers, so
if you need to support those, then stick to the mnemonic entities (&euml;)
and decimal character references (&#235;).
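If you have existing markup full of hexadecimal references and need to cater for such browsers, something along these lines could rewrite them as decimal ones. This is a rough sketch only: the function name hex_refs_to_decimal is hypothetical, and a plain regular expression knows nothing about comments, scripts or CDATA sections.

```python
import re

# Matches hexadecimal character references such as &#xEB;
HEX_REF = re.compile(r"&#[xX]([0-9A-Fa-f]+);")

def hex_refs_to_decimal(markup: str) -> str:
    """Rewrite every hexadecimal character reference as a decimal one."""
    return HEX_REF.sub(lambda m: "&#%d;" % int(m.group(1), 16), markup)

print(hex_refs_to_decimal("No&#xEB;l"))
```

Mnemonic entities and references that are already decimal pass through unchanged.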
Circumstance 2: XML only has five mnemonic character entities -- "&amp;",
"&gt;", "&lt;", "&quot;" and "&apos;". Others can be defined, but should
not be relied on as they require the processing agent to read the DTD to
understand what they are. Many agents do not read the DTD (formally, they
don't have to), so will not understand the entities. For this reason, it's
wise to stick to only using numeric character references in XML, except for
"&amp;", "&gt;", "&lt;" and "&quot;". (I leave out "&apos;" because
Internet Explorer doesn't support it -- use "&#39;" instead.) As XHTML is
a variety of XML, this advice applies to XHTML too.
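You can watch an XML processor that doesn't read any DTD behave this way with Python's bundled ElementTree parser. This is a sketch of one particular parser, not a statement about every XML agent, and the helper name parse_text is made up for the example.

```python
import xml.etree.ElementTree as ET

def parse_text(xml_source: str):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None

predefined = parse_text("<p>&amp;&gt;&lt;&quot;&apos;</p>")  # the five built-ins
numeric = parse_text("<p>&#235;</p>")                        # numeric reference
named = parse_text("<p>&euml;</p>")                          # HTML-only entity

print(repr(predefined), repr(numeric), repr(named))
```

The five predefined entities and the numeric reference are understood, while the document containing &euml; is rejected outright as not well-formed, because nothing has told the parser what that entity means.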
--
Toby A Inkster BSc (Hons) ARCS
http://tobyinkster.co.uk/
Geek of ~ HTML/SQL/Perl/PHP/Python/Apache/Linux