Re: persian languages charset, and what DOCTYPE?

Posted by Alan J. Flavell on 10/12/71 11:44

On Sat, 8 Apr 2006, Harlan Messinger wrote:

> else, and the appearance in two places of "تست2", once after the
> date at the top, and once as the first item in list of Recent Posts.
> The first one appears in the page source as "ØªØ³Øª2"

Yes, I'd spotted that, and noted that if interpreted as utf-8 it turns
out as Arabic-script characters, which made it seem as if that part
had been inserted into it incorrectly.
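To make that interpretation concrete, here is a small Python sketch. It assumes the bytes in the page were written as utf-8 but are being displayed as iso-8859-1; the helper name is invented for this illustration and is not from the page or any library.

```python
def reinterpret_as_utf8(mojibake: str) -> str:
    """Hypothetical helper: take text that was decoded as iso-8859-1
    and decode the same bytes as utf-8 instead."""
    return mojibake.encode("iso-8859-1").decode("utf-8")


# The three Arabic-script letters plus the digit 2, written with escapes.
arabic = "\u062a\u0633\u062a2"

# The same bytes, seen through an iso-8859-1 lens: 'ØªØ³Øª2'.
as_latin1 = arabic.encode("utf-8").decode("iso-8859-1")

print(as_latin1)
print(reinterpret_as_utf8(as_latin1))
```

Each Arabic letter occupies two bytes in utf-8, which is why every letter shows up as a pair of Latin-1 characters.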

> and the second appears as
> "&Oslash;&ordf;&Oslash;&sup3;&Oslash;&ordf;2", the character entity
> representation of the same thing.

Blimey, so it does! I hadn't spotted that at first look. So it's
worse than just broken!!

Furthermore, I now see loads of hrefs like these:

http://journalhome.com/razavi/21877/%26Oslash%3B%26ordf%3B%26Oslash%3B%26sup3%3B%26Oslash%3B%26ordf%3B2.html

*Shudder*
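For anyone who wants to see how many layers are stacked up in such an href, a rough Python sketch follows. It assumes the layers are, from the outside in: percent-encoding, HTML named entities, and utf-8 bytes misread as iso-8859-1. The function name is made up for this example.

```python
import html
from urllib.parse import unquote


def unwrap_href_segment(segment: str) -> str:
    """Hypothetical helper: peel the three assumed layers off one
    path segment of the href."""
    entities = unquote(segment)          # %26Oslash%3B -> &Oslash;
    latin1 = html.unescape(entities)     # &Oslash;     -> Ø
    return latin1.encode("iso-8859-1").decode("utf-8")


segment = ("%26Oslash%3B%26ordf%3B%26Oslash%3B%26sup3%3B"
           "%26Oslash%3B%26ordf%3B2.html")
print(unwrap_href_segment(segment))
```

The final step only works when every character left after unescaping fits in iso-8859-1, as it does for this particular segment.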

For what it's worth - coming back to the تست2 which we saw, if I
convert[1] that from utf-8 to us-ascii encoding then the result reads:

&#1578;&#1587;&#1578;2

which can be decoded e.g. with my trusty decoding ring (;-) at
http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata06.html
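The same conversion can be approximated outside the browser. This is only a sketch in Python, not what Composer does internally: the standard "xmlcharrefreplace" error handler writes a decimal numeric character reference for each character that us-ascii cannot represent, and html.unescape plays the part of the decoding ring.

```python
import html

# The three Arabic-script letters plus the digit 2, written with escapes.
arabic = "\u062a\u0633\u062a2"

# Characters outside us-ascii become &#NNNN; references.
as_ascii = arabic.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(as_ascii)

# Decimal code point of each character, for looking up in a Unicode chart.
print([ord(ch) for ch in arabic])

# Round trip back to the original characters.
print(html.unescape(as_ascii) == arabic)
```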


At this kind of third-hand remove from the original complainant, and
with me only understanding the theory of the character representation,
without being able to read Farsi, and without the slightest inclination
to tangle with the mess that comes out of MS's attempts to extrude
something resembling HTML, I'm afraid I can't go much further than to
say that these pages seem to be dreadfully broken; it's a wonder that
anything comes out as intended.

good luck (you-all will need it!)

[1] By "convert" I mean, in Seamonkey (née Mozilla), manually set
View> Encoding to utf-8, then File> Edit Page, then in Composer,
"Save and change character encoding". Unfortunately it doesn't
offer us-ascii as an option, but any 8-bit encoding which doesn't
cover Arabic would suffice for this purpose - e.g. Armenian, Thai,
whatever you like. (Perhaps we should ask the Mozilla folks to
support saving in us-ascii explicitly?)
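The point about "any 8-bit encoding which doesn't cover Arabic" can be checked with a short Python sketch. Python's codec names stand in for the browser's menu entries here, and that mapping is an assumption of this example: "iso8859_11" for Thai, "iso8859_6" for an encoding that does cover Arabic.

```python
# The three Arabic-script letters plus the digit 2, written with escapes.
arabic = "\u062a\u0633\u062a2"

# Thai has no Arabic letters, so they all fall back to &#NNNN; references,
# exactly as they would with us-ascii.
via_thai = arabic.encode("iso8859_11", errors="xmlcharrefreplace")
via_ascii = arabic.encode("ascii", errors="xmlcharrefreplace")
print(via_thai == via_ascii)

# An encoding that does cover Arabic stores the letters as single bytes,
# so no references appear and the trick no longer works.
via_arabic = arabic.encode("iso8859_6", errors="xmlcharrefreplace")
print(b"&#" in via_arabic)
```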

 
