Posted by Umberto Salsi on 07/22/05 06:31
lkrubner@geocities.com wrote:
> I output everything from my site as UTF-8. I'd like to check the input
> for characters that are not UTF-8 and then turn the bad ones to an
> ASCII question mark. [...]
This function simply drops all the illegal sequences and returns a legal
UTF-8 string:
function Force_UTF_8($s)
{
    return mb_convert_encoding($s, 'UTF-8', 'UTF-8');
}
For example:
Force_UTF_8("A\x00B\xc0\x80C") ==> "A\000BC"
Note that the control characters (here \000) aren't removed.
This should be enough for many cases. You may send a warning to the user if
the resulting string differs from the original one.
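A minimal sketch of the comparison suggested above. Clean_With_Warning is
a hypothetical helper name, not something PHP provides, and Force_UTF_8 is
repeated only so the snippet runs on its own. Depending on the PHP version
and the mbstring substitute-character setting, invalid bytes may be
replaced (for instance by "?") rather than dropped; the comparison works
either way.

```php
<?php
// Same function as in the post, repeated so the sketch is self-contained.
function Force_UTF_8($s)
{
    return mb_convert_encoding($s, 'UTF-8', 'UTF-8');
}

// Hypothetical helper: returns the cleaned string plus a flag that is
// true when the cleaning step had to change the input.
function Clean_With_Warning($s)
{
    $clean = Force_UTF_8($s);
    return array($clean, $clean !== $s);
}

// Usage: warn only when the submitted text was altered.
list($clean, $changed) = Clean_With_Warning("A\xc0\x80C");
if ($changed) {
    echo "Warning: your input contained invalid UTF-8 and was modified.\n";
}
```

The strict comparison (!==) is byte-wise, so any dropped or substituted
byte is detected.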
Regards,
___
/_|_\ Umberto Salsi
\/_\/ www.icosaedro.it