 Posted by Umberto Salsi on 07/22/05 06:31 
lkrubner@geocities.com wrote: 
 
> I output everything from my site as UTF-8. I'd like to check the input 
> for characters that are not UTF-8 and then turn the bad ones to an 
> ASCII question mark. [...] 
 
This function simply drops all the illegal sequences and returns a legal 
UTF-8 string: 
 
        function Force_UTF_8($s) 
        { 
                return mb_convert_encoding($s, 'UTF-8', 'UTF-8'); 
        } 
 
For example: 
 
        Force_UTF_8("A\x00B\xc0\x80C") ==> "A\000BC" 
 
Note that the control characters (here \000) aren't removed. 
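
If you want the ASCII question mark you asked for, rather than having the bad bytes silently dropped, one option is to set mbstring's substitution character explicitly before converting. The following is only a sketch: the name force_utf8_marked is made up for illustration, and it assumes your PHP build applies mb_substitute_character() to invalid input bytes. The number of question marks emitted per bad sequence may differ between PHP versions, so don't rely on it.

```php
<?php
// Hypothetical helper (name is illustrative): like Force_UTF_8(), but asks
// mbstring to put '?' where it finds bytes that are not valid UTF-8.
function force_utf8_marked($s)
{
    $previous = mb_substitute_character();  // remember the current setting
    mb_substitute_character(0x3F);          // 0x3F is the code point of '?'
    $clean = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);     // restore the caller's setting
    return $clean;
}

// Usage: the invalid pair \xc0\x80 should come back as question mark(s).
$result = force_utf8_marked("A\x00B\xc0\x80C");
```

Passing "none" instead of 0x3F asks mbstring to emit nothing for invalid input, which makes the dropping behaviour explicit rather than a default.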
 
This should be enough for many cases. You may send a warning to the user if 
the resulting string differs from the original one. 
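
The warning can be a plain string comparison after the conversion. A minimal sketch, where clean_and_flag is a made-up name and the warning text is only a placeholder:

```php
<?php
// Hypothetical helper (name is illustrative): returns the cleaned string
// together with a flag telling whether the conversion changed anything.
function clean_and_flag($s)
{
    $clean = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    return array($clean, $clean !== $s);
}

// Usage: \xff can never appear in valid UTF-8, so this input gets flagged.
list($text, $changed) = clean_and_flag("caf\xc3\xa9 \xff");
if ($changed) {
    echo "Warning: your input contained invalid characters.\n";
}
```

Comparing the two strings catches dropped bytes as well as substituted ones, so it works whatever substitution character is in effect.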
 
Regards, 
 ___  
/_|_\  Umberto Salsi 
\/_\/  www.icosaedro.it
 
  