Reply to Re: Odd character display / UTF issue ? — HTML

Posted by Jukka K. Korpela on 10/15/07 14:34

Scripsit Andy Dingley:

> On 15 Oct, 05:50, "Jukka K. Korpela" <jkorp...@cs.tut.fi> wrote:
>> Sounds like character encoding confusion. Anything that _looks_ like
>> "? " is probably something UTF-8 encoded (or distorted UTF-8)
>> interpreted by some 8-bit encoding.
>
> No, characters in a UTF-8 encoding interpreted by a tool using non-
> UTF-8 encoding will generally generate garbage characters that are
> still displayable

That's what I wrote about, using the (iso-8859-1 encoded) character Â
(letter A with circumflex accent) as in the original question. I wonder what
piece of software munged it, but it wasn't anything I was using.

> (the tool thinks that it received two good
> characters, they just don't mean anything).

Two, three or four.

> Typically it's a pair of
> characters, the first of these is some variant of an accented
> "A"

Yes, at least when the 8-bit encoding is ISO-8859-1.
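
The "two, three or four" point can be checked with a minimal Python sketch. The sample characters and the helper name misread_as_latin1 are chosen here for illustration only and are not from the thread. Each sample is encoded as UTF-8 and the octets are then deliberately decoded as ISO-8859-1:

```python
# Illustrative sketch: what UTF-8 data looks like when the octets are
# wrongly interpreted as ISO-8859-1. Sample characters are arbitrary.
samples = {
    "NO-BREAK SPACE": "\u00a0",            # 2 octets in UTF-8
    "EURO SIGN": "\u20ac",                 # 3 octets in UTF-8
    "GOTHIC LETTER AHSA": "\U00010330",    # 4 octets in UTF-8
}


def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the same octets as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


for name, ch in samples.items():
    garbled = misread_as_latin1(ch)
    # One displayed character per octet, so the count is 2, 3 or 4.
    print(name, len(garbled), [f"U+{ord(c):04X}" for c in garbled])
```

Under these assumptions the no-break space comes out as U+00C2 (Â) followed by U+00A0, which matches the "accented A first" pattern for two-octet sequences. The three- and four-octet samples start with other letters (U+00E2 and U+00F0).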

The combination "Â " also indicates some other error, since the octet
combination C2 20 must not appear in UTF-8 encoded data. We have little way
of knowing what happened, but I'd guess that 20 (which looks like space when
interpreted according to ISO-8859-1) was some octet in the range 80..9F,
maybe something that isn't allocated in windows-1252.
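
As a rough check of the claim about C2 20, the following Python sketch tests a few octet pairs with the standard codecs and lists which octets in the range 80..9F Python's cp1252 codec refuses to decode. The helper names are made up for this illustration, and the sketch says nothing about which software actually altered the data.

```python
# Illustrative sketch only; helper names are hypothetical.
def decodes_as(octets: bytes, encoding: str) -> bool:
    """Return True if the octets decode without error in the given encoding."""
    try:
        octets.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Lead octet C2 must be followed by a continuation octet in 80..BF.
print(decodes_as(b"\xc2\x20", "utf-8"))   # C2 20: not well-formed UTF-8
print(decodes_as(b"\xc2\xa0", "utf-8"))   # C2 A0: U+00A0 NO-BREAK SPACE
print(decodes_as(b"\xc2\x81", "utf-8"))   # C2 81: U+0081, a C1 control

# Octets in 80..9F that Python's cp1252 codec treats as undefined.
unassigned_cp1252 = [
    b for b in range(0x80, 0xA0) if not decodes_as(bytes([b]), "cp1252")
]
print([f"{b:02X}" for b in unassigned_cp1252])
```

If the second octet was originally one of those undefined values, a converter that substitutes a space for anything it cannot map would be one possible way to end up with C2 20, but that remains a guess.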

> To get the unrecognizable character "?" displayed,

Which unrecognizable "?"? The question mark is recognizable, and so is the
character "Â", which is what was actually included in the original question.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/
