Posted by Umberto Salsi on 05/23/07 15:48
Nick Wedd <nick@maproom.co.uk> wrote:
> I am running PHP version 5.2.0-8+etch3.
>
> If I run mb_convert_encoding( $string, "UTF-8", "HTML-ENTITIES")
> on a string containing "&#x0415;", it ought to produce the two bytes
> whose decimal values are 208 149. But it produces the four bytes whose
> decimal values are 242 175 184 159.
>
> Is there a fix for this? Does a later version of PHP get it right, or
> do I have to write my own conversion function?
>
> Nick
> --
> Nick Wedd nick@maproom.co.uk
Indeed, it is a bug: mbstring lacks support for hexadecimal numeric entities.
They are decoded as if they were decimal, so 'x' becomes the "decimal digit"
'x'-'0' = 120-48 = 72, which is where the strange value you obtained comes
from. I have already sent a short patch to internalsATlists.php.net, hoping
that is the right place.
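
To make the arithmetic concrete, here is a minimal sketch of the faulty parse. It is not mbstring's actual code, only an imitation of the behaviour described above, and all three function names are hypothetical. It assumes the entity in the question was written as &#x0415;, which is the spelling consistent with the four bytes reported.

```php
<?php
// Hypothetical helper: mimic the faulty parse, where every character
// after "&#" is treated as a decimal digit (value = char - '0').
function buggy_decimal_parse($digits)
{
    $value = 0;
    for ($i = 0; $i < strlen($digits); $i++) {
        $value = $value * 10 + (ord($digits[$i]) - ord('0'));
    }
    return $value;
}

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
function codepoint_to_utf8($cp)
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: list the decimal byte values of a string.
function byte_values($s)
{
    return implode(' ', array_map('ord', str_split($s)));
}

$wrong = buggy_decimal_parse('x0415'); // 72*10000 + 415 = 720415
$right = hexdec('0415');               // 1045, i.e. U+0415

echo $wrong, ' => ', byte_values(codepoint_to_utf8($wrong)), "\n";
// prints: 720415 => 242 175 184 159
echo $right, ' => ', byte_values(codepoint_to_utf8($right)), "\n";
// prints: 1045 => 208 149
```

The first line of output matches the four bytes from the question, and the second matches the two bytes that were expected.
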
PS. If you are playing with such a conversion, chances are that your HTML
pages are not properly encoded. Are you sure you need to do that?
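
If the conversion is needed anyway, one possible stopgap is to rewrite hexadecimal entities as decimal ones before calling mb_convert_encoding(), since only the hexadecimal form is affected. The sketch below is an illustration under that assumption, not the patch mentioned above, and both function names are hypothetical.

```php
<?php
// Hypothetical callback: turn one matched hex entity into its decimal form.
function hex_entity_to_decimal($match)
{
    return '&#' . hexdec($match[1]) . ';';
}

// Hypothetical helper: rewrite every &#xHHHH; in $html as &#DDDD;.
function rewrite_hex_entities($html)
{
    return preg_replace_callback(
        '/&#x([0-9a-f]+);/i',
        'hex_entity_to_decimal',
        $html
    );
}

$input = 'Cyrillic: &#x0415; and &#1045;';
$fixed = rewrite_hex_entities($input);

echo $fixed, "\n";
// prints: Cyrillic: &#1045; and &#1045;

// The rewritten string can then go through the call from the question:
// mb_convert_encoding($fixed, "UTF-8", "HTML-ENTITIES");
```

Decimal entities already present in the input are left untouched, so the pre-pass is safe to apply to mixed content.
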
Best regards,
___
/_|_\ Umberto Salsi
\/_\/ www.icosaedro.it