Re: Cyrillic on the web

Posted by Jukka K. Korpela on 06/02/06 05:59

Alan J. Flavell <flavell@physics.gla.ac.uk> scripsit:

> Yes, there seem to be three bytes there: d4 aa f8. I can't help
> worrying that they started life as a utf-8 BOM (ef bb bf), and have
> been mapped through whatever misguided encoding conversion has
> scrambled the rest of the content.

Well spotted.
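
For anyone who wants to check the ef bb bf part, here is a minimal Python sketch. It only confirms that those three octets are the UTF-8 BOM and shows one way to drop it when decoding. The variable names and the "abc" sample are made up for illustration, and the sketch does not try to reproduce the conversion table discussed above.

```python
import codecs

# The three octets quoted above, written as hex.
bom = bytes.fromhex("ef bb bf")

# They are exactly the UTF-8 encoding of U+FEFF, the byte order mark.
is_utf8_bom = (bom == codecs.BOM_UTF8)
bom_char = bom.decode("utf-8")

# Hypothetical sample: a BOM followed by plain ASCII text.
# The "utf-8-sig" codec strips a leading BOM while decoding.
sample = bom + b"abc"
stripped = sample.decode("utf-8-sig")

print(is_utf8_bom, repr(bom_char), stripped)
```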

> Oh yes, A.Prilop is going to love this!! That's exactly what happens
> when one passes ef bb bf through Mr. Pirard's old Mac -> iso-8859-1
> conversion table from 1992.

Sounds quite plausible under the circumstances.

> Hmmm yes, if I take the first 6 bytes of the document title: ad fc 8b
> c4 ad bd, and run them back through Pirard's table, I get d0 9f d1 80
> d0 b8 , which is the utf-8 representation of the three Cyrillic
> letters for "Pri" (I'm not going to try to put Cyrillic letters into
> this posting!). Going on a bit further, I make it out to be
> "Privetst...", does that make some kind of sense?

Surely, it's the start of a Russian word that means 'greeting'. (Of course,
using such words in a document title is a waste of precious real estate, but I
digress.)
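
The last step of that reconstruction is easy to verify. A small Python sketch, assuming only the six octets quoted above (the variable names are mine): decoding them as UTF-8 should give three Cyrillic letters.

```python
# The six octets obtained after undoing the conversion, as quoted above.
recovered = bytes.fromhex("d0 9f d1 80 d0 b8")

# Each Cyrillic letter takes two octets in UTF-8, so this yields three characters.
text = recovered.decode("utf-8")

# List the code points to see which letters they are.
code_points = [f"U+{ord(ch):04X}" for ch in text]

print(text, code_points)
```

The code points fall in the basic Cyrillic block (U+0400 to U+04FF), which agrees with the reading "Pri".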

> However, I think I'd prefer to start again from fresh materials!!

Me too. And using UTF-8 for Russian isn't particularly efficient. Using e.g.
windows-1251, you have one octet (byte) for each character. Using UTF-8, you
have one octet for each character in the ASCII range (including characters
used in HTML markup) but two octets for each Cyrillic letter. UTF-8 would be
fine if the document contained, say, a mixture of Russian and French.
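
To put numbers on that, here is a rough Python sketch. The sample string is a made-up fragment (a title element holding the three letters mentioned above), not taken from the page in question.

```python
# Hypothetical fragment: 15 ASCII markup characters plus 3 Cyrillic letters.
sample = "<title>\u041f\u0440\u0438</title>"

# windows-1251 uses one octet per character, markup and Cyrillic alike.
size_1251 = len(sample.encode("windows-1251"))

# UTF-8 uses one octet per ASCII character and two per Cyrillic letter.
size_utf8 = len(sample.encode("utf-8"))

print(len(sample), size_1251, size_utf8)
```

For this fragment the difference is three octets, one for each Cyrillic letter. The more markup a page has relative to its text, the smaller the relative difference gets.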

--
Yucca, http://www.cs.tut.fi/~jkorpela/

 
