Posted by Andy Dingley on 03/16/07 12:04
On 15 Mar, 18:45, grou...@reenie.org wrote:
> as I said before, http://reenie.org/test/unicode.htm
> is http://reenie.org/test/ascii.htm cleaned by tidy with utf
> encoding.
Ok, I think I understand what you've done now.
http://reenie.org/test/unicode.htm is broken. It appears to have an
ISO-8859-1 character in the file being served as a UTF-8 document.
Tidy didn't make this file. AFAIK, the Tidy you're using takes its
input from Firefox and doesn't have any "output to file" feature. You
must have taken its output from the clipboard, pasted it into your
choice of editor and saved it from there. At this point, I can only
assume that the file was a correctly-encoded ISO-8859 file.
The web server then gets to it and serves it up, with UTF-8 encoding
headers or embedded metas in it. Things go wrong _at_this_point_. File
is good (but not UTF-8), web document is bad (mis-labelled and thus
unreadable).
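To illustrate the mismatch, here is a minimal Python sketch. The sample word is my own assumption (I don't know which character is actually in your page), and decodes_as_utf8 is just a helper name made up for the example. It shows that the same text saved as ISO-8859-1 produces bytes that are not well-formed UTF-8, which is roughly what a browser runs into when the label says UTF-8.

```python
# Hypothetical sample text; the real offending character is not known.
text = "caf\u00e9"  # "cafe" with an acute accent on the e

latin1_bytes = text.encode("iso-8859-1")  # what the editor probably saved
utf8_bytes = text.encode("utf-8")         # what the UTF-8 label promises


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes)                   # one byte for the accented letter
print(utf8_bytes)                     # two bytes for the accented letter
print(decodes_as_utf8(latin1_bytes))  # the ISO-8859-1 bytes fail
print(decodes_as_utf8(utf8_bytes))    # the UTF-8 bytes pass
```

The file itself is fine in both cases; only the pairing of bytes and label decides whether the document is readable.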
I suggest you try the "Tidy cleanup" process again, but this time make
sure that your editor's save setting is UTF-8. jEdit is a well-behaved
editor here; some others (e.g. Eclipse) aren't. Watch out for Windows
editors, as they often say "Unicode" and mean UTF-16, which isn't
what's wanted at all. Look for a specific UTF-8 option.
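If you want to check what your editor really wrote, something like the following Python sketch might help. The function name describe_encoding and its return strings are my own invention for illustration. It only looks for a UTF-16 byte order mark and then tries a strict UTF-8 decode, so treat the answer as a hint rather than proof (plain ASCII, for instance, will report as UTF-8, which is technically true).

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Make a rough guess at how a saved file was encoded.

    Hypothetical helper: checks for a UTF-16 byte order mark first,
    then tests whether the bytes are well-formed UTF-8.
    """
    # Windows editors saving as "Unicode" usually write UTF-16 with a BOM.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 (BOM found)"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8; an 8-bit encoding such as ISO-8859-1 is a likely suspect.
        return "not utf-8 (possibly iso-8859-1)"
    return "utf-8"


sample = "caf\u00e9"  # hypothetical test text
print(describe_encoding(sample.encode("utf-8")))
print(describe_encoding(sample.encode("iso-8859-1")))
print(describe_encoding(sample.encode("utf-16")))
```

To run it against a real file, read the file in binary mode (open(path, "rb")) and pass the bytes to the function, so that Python does not decode anything behind your back.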