Reply to Re: Odd character display / UTF issue ? — HTML

Posted by Andy Dingley on 10/15/07 10:18

On 15 Oct, 05:50, "Jukka K. Korpela" <jkorp...@cs.tut.fi> wrote:
> Sounds like character encoding confusion. Anything that _looks_ like "? " is
> probably something UTF-8 encoded (or distorted UTF-8) interpreted by some
> 8-bit encoding.

No. UTF-8 encoded characters interpreted by a tool that assumes a
non-UTF-8 (8-bit) encoding will generally produce garbage characters
that are still displayable: the tool thinks it has received two
(sometimes three) good characters, they just don't mean anything.
Typically you see a pair of characters, the first of which is some
variant of an accented "A" (they won't all be, but if you see lots of
spurious "A"s on a page, suspect UTF-8).
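
A minimal Python sketch of that first case, assuming the receiving tool
falls back to ISO-8859-1 (the sample word and variable names are only
illustrative):

```python
# "caf\u00e9" is "cafe" with an e-acute (U+00E9).
# UTF-8 encodes U+00E9 as the two bytes 0xC3 0xA9.
text = "caf\u00e9"
utf8_bytes = text.encode("utf-8")

# A tool that assumes ISO-8859-1 maps every byte to exactly one
# character, so the two-byte sequence becomes two displayable
# characters: A-tilde (U+00C3) followed by the copyright sign (U+00A9).
misread = utf8_bytes.decode("iso-8859-1")

print(repr(utf8_bytes))
print(repr(misread))
```

No error is raised here, because every byte value is a legal
ISO-8859-1 character; the result is wrong but perfectly displayable.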

To get the "unrecognizable character" mark "?" displayed, your tool
must have been able to recognise garbage automatically, i.e. bad
encodings, not just bad characters. This usually indicates non-UTF-8
bytes being served as UTF-8, with the tool then unable to decode them
as UTF-8. As ASCII is simultaneously valid UTF-8 and valid
ISO-8859-*, the most likely cause is non-ASCII characters in an
ISO-8859-* encoding served with a UTF-8 content-type.
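
The reverse case can be sketched the same way, assuming a tool that
substitutes U+FFFD (commonly drawn as "?" or a question mark in a
diamond) for bytes it cannot decode. Python's errors="replace" handler
stands in for that behaviour here; real browsers may differ in detail:

```python
# The same word, this time encoded as ISO-8859-1: the e-acute is the
# single byte 0xE9.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")

# 0xE9 announces a three-byte UTF-8 sequence, but no continuation
# bytes follow, so a strict UTF-8 decoder rejects the input outright.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD for the undecodable byte.
shown = latin1_bytes.decode("utf-8", errors="replace")

# Pure ASCII is byte-for-byte identical in both encodings, so it
# survives the mislabelling untouched.
ascii_roundtrip = "cafe".encode("iso-8859-1").decode("utf-8")

print(strict_ok, repr(shown), repr(ascii_roundtrip))
```

This is why only the non-ASCII characters on such a page turn into
"?" while the rest of the text looks fine.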
