Re: UTF-8 html entity decoding

Posted by Darko on 11/02/07 20:42

On Nov 2, 9:02 pm, Jake <off...@gmail.com> wrote:
> I have a string that has UTF-8 characters encoded using html
> entities. For example the string "é 字" is being encoded as "&#233;
> &#23383;". I have no control over how this string is given to me, so
> I need to figure out a way to decode "&#233; &#23383;" back into "é
> 字".
>
> I have already tried urldecode, html_entity_decode, utf8_decode and
> convert_uudecode without success. My server environment is limited to
> the latest version of PHP 4, so I can't use any PHP 5 stuff.
>
> Anyone have suggestions?

Here's the sample from php.net's page about utf8_encode
(http://www.php.net/manual/en/function.utf8-encode.php), thanks to a certain
luka8088:

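function html_to_utf8 ($data)
{
    // Replace every decimal entity of 3 to 10 digits via _html_to_utf8().
    // The /e modifier evaluates the replacement as PHP code; it works on
    // PHP 4 but was deprecated in PHP 5.5 and removed in PHP 7.
    return preg_replace("/\\&\\#([0-9]{3,10})\\;/e", '_html_to_utf8("\\1")', $data);
}

function _html_to_utf8 ($data)
{
    if ($data > 127)
    {
        // Find the highest power of 64 that fits into the code point.
        // Caveat: this picks the sequence length by powers of 64, so code
        // points 2048-4095 and 65536-262143 get one byte too few.
        $i = 5;
        while (($i--) > 0)
        {
            if ($data != ($a = $data % ($p = pow(64, $i))))
            {
                // Lead byte: ($i + 1) one-bits padded to 8 bits, plus the top bits.
                $ret = chr(base_convert(str_pad(str_repeat(1, $i + 1), 8, "0"), 2, 10) + (($data - $a) / $p));
                // Continuation bytes: 10xxxxxx, six bits each.
                for ($i; $i > 0; $i--)
                    $ret .= chr(128 + ((($data % pow(64, $i)) - ($data % ($p = pow(64, $i - 1)))) / $p));
                break;
            }
        }
    } else
        // ASCII-range entities are returned unchanged.
        $ret = "&#$data;";
    return $ret;
}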
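
To see what the arithmetic is doing, here is &#233; (é) worked by hand as a rough sketch. The variable names $cp, $lead and $tail are just ones I picked for illustration; they are not part of the sample.

```php
<?php
// Worked example for &#233; (é, U+00E9), a two-byte UTF-8 character.
// Two-byte layout: 110xxxxx 10xxxxxx
$cp = 233;

// Lead byte: marker 11000000 (192) plus the bits above the low six.
$lead = 192 + (int) floor($cp / 64);   // 192 + 3  = 195 (0xC3)

// Continuation byte: marker 10000000 (128) plus the low six bits.
$tail = 128 + ($cp % 64);              // 128 + 41 = 169 (0xA9)

$bytes = chr($lead) . chr($tail);
echo bin2hex($bytes), "\n";            // prints c3a9
```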

Example:
echo html_to_utf8("a b &#269; &#263; &#382; &#12371; &#12395; &#12385; &#12431; ()[]{}!#$?* &lt; &#62;");

Output:
a b č ć ž こ に ち わ ()[]{}!#$?* &lt; &#62;
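
If you would rather not rely on the /e modifier, a range-based variant built on preg_replace_callback might look like the sketch below. As far as I know preg_replace_callback has been around since PHP 4.0.5, so it should fit a PHP 4 server, but check that against your setup. The function names (codepoint_to_utf8, decode_numeric_entity_cb, decode_numeric_entities) are hypothetical and not from the php.net sample. The sketch picks the byte count from the standard UTF-8 ranges and, as an assumption on my part, leaves ASCII, surrogate and out-of-range entities untouched.

```php
<?php
// Hypothetical helpers; not part of the php.net sample.

// Encode one Unicode code point as UTF-8, choosing the length by range.
function codepoint_to_utf8($cp)
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Callback: $m[0] is the whole entity, $m[1] the decimal digits.
function decode_numeric_entity_cb($m)
{
    $cp = (int) $m[1];
    // Assumption: keep ASCII, surrogates and out-of-range values as they are.
    if ($cp < 128 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        return $m[0];
    }
    return codepoint_to_utf8($cp);
}

function decode_numeric_entities($data)
{
    return preg_replace_callback('/&#([0-9]{1,7});/', 'decode_numeric_entity_cb', $data);
}

echo bin2hex(decode_numeric_entities('&#233; &#23383;')), "\n"; // prints c3a920e5ad97
```

The hex dump is only there so the result does not depend on the terminal's encoding; c3 a9 is é and e5 ad 97 is 字.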

Cheers

 
