Re: Html-encode all characters not in the current character set

Posted by Willem Bogaerts on 04/27/07 07:24

> Is there a function that will allow me to
> output text written in utf-8 (from db for example)
> if my document has
>
> Content-Type: text/html; charset=ISO-8859-1
>
> I mean htmlspecialchars() and htmlentities() will only convert
> characters that have an associated entity defined in HTML.
> I would also like to translate all non-latin1 characters using
> numeric references.

There are two terms of interest here: "character set" and "encoding".
The character set is the collection of characters you can use; the
encoding is how those characters are written out as bytes.

ISO-8859-1 is an encoding that covers only a limited character set, so
there is no euro sign, for example. The bad thing about ISO-8859-1 is
that some programs silently treat it as Windows-1252 (cp1252), which is
similar but not exactly the same (it does have a euro sign).
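
To make the difference concrete, here is a small sketch. It assumes the mbstring extension is loaded and converts the euro sign from UTF-8 to both encodings:

```php
<?php
// Assumption: the mbstring extension is loaded.
$euro = "\u{20AC}"; // the euro sign, as UTF-8 bytes

// Windows-1252 has the euro sign at byte 0x80.
$cp1252 = mb_convert_encoding($euro, 'Windows-1252', 'UTF-8');

// ISO-8859-1 has no euro sign, so mbstring falls back to its
// substitute character instead.
$latin1 = mb_convert_encoding($euro, 'ISO-8859-1', 'UTF-8');

echo bin2hex($cp1252), "\n"; // prints 80
echo bin2hex($latin1), "\n"; // hex of whatever mbstring substituted
```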


> &#355; is for a Romanian letter, ţ, for example, and letter ţ
> written in UTF-8 is not translated by htmlentities(), even if
> I give the function the optional character-set argument, 'UTF-8'
> (you can actually see the letter I typed if your system and your
> news reader understand and can display ISO latin 2 characters,
> encoded in utf-8).

So you want to encode characters that are NOT in the character set you
explicitly declare. If you do want those characters, why declare an
encoding that does not cover them? Use a character set that does have
them (like Unicode) and an encoding that covers them (UTF-8 is fairly
common), and declare that in your Content-Type header instead.
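
A minimal sketch of that approach, assuming the text from the database really is UTF-8. The helper name escape_utf8() is made up for illustration:

```php
<?php
// Hypothetical helper: escape UTF-8 text for a page that is itself
// served as UTF-8.
function escape_utf8(string $text): string
{
    // Only the HTML-special characters need escaping; every other
    // character can be sent as-is because the page encoding covers it.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Declare the encoding that actually matches the bytes being sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

echo '<p>', escape_utf8("\u{0163} & \u{20AC}"), "</p>\n";
```

With the header and the data in agreement, no numeric references are needed at all.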

> I mean HTML documents can use characters in the entire UNICODE
> set, even if the document source is written in ASCII for example,
> by encoding any non-ASCII character with HTML entities.

Are you sure about that?

> Is there in PHP a function that will encode in HTML all non-ASCII
> characters, or all non-latin1 characters, or all characters not in the
> source character set ?

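The htmlentities() function does have an encoding parameter, but you have
already used that. As for numeric references, HTML defines them in terms
of Unicode code points rather than the document's encoding, so they
should remain valid in a page declared as ISO-8859-1.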
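
If you have to stay with ISO-8859-1, here is a sketch to try. In HTML a reference such as &#355; names Unicode code point 355 (U+0163), so the helper below turns every character above U+00FF into a decimal reference and converts the rest to ISO-8859-1 bytes. The function name encode_for_latin1() is made up for illustration, and the code assumes valid UTF-8 input and the mbstring extension:

```php
<?php
// Hypothetical helper (not a PHP built-in). Assumes valid UTF-8 input
// and the mbstring extension.
function encode_for_latin1(string $utf8): string
{
    // 1. Escape the HTML-special characters first, so the "&" of the
    //    references added in step 2 is not escaped a second time.
    $escaped = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');

    // 2. Replace every code point above U+00FF with a decimal reference.
    //    Map format: [first code point, last code point, offset, mask].
    $map = [0x100, 0x10FFFF, 0, 0xFFFFFF];
    $withRefs = mb_encode_numericentity($escaped, $map, 'UTF-8');

    // 3. What remains lies within U+0000..U+00FF, so it converts to
    //    ISO-8859-1 without loss.
    return mb_convert_encoding($withRefs, 'ISO-8859-1', 'UTF-8');
}

echo encode_for_latin1("\u{0163}"), "\n"; // prints &#355;
```

Changing the first entry of the map to 0x80 would give pure ASCII output, with every non-ASCII character written as a reference.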

Best regards,
--
Willem Bogaerts

Application smith
Kratz B.V.
http://www.kratz.nl/

 
