Posted by Andy Hassall on 01/09/07 00:43
On Tue, 09 Jan 2007 01:14:28 +0100, Matthias Langbein<matthias_langbein@web.de> wrote:
 
 >When I convert an uploaded file to UTF-8 with the utf8_encode function,
 >the string is prefixed by the two characters
 >
 >ÿþ
 
 That appears to be a Unicode BOM:
 http://unicode.org/unicode/faq/utf_bom.html#BOM
 
 However, it's a UTF-16 little-endian BOM (bytes FF FE; big-endian would be
 FE FF), and should presumably have been converted into EF BB BF, the UTF-8
 representation, or stripped off.
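
 As a rough sketch of how you might check which BOM a file starts with:
 detect_bom is a made-up helper name, not a PHP built-in, and it ignores the
 UTF-32 BOMs for brevity (FF FE 00 00 would be misreported as UTF-16LE).

```php
<?php
// Hypothetical helper: name the BOM at the start of a byte string, if any.
// Assumption: UTF-32 BOMs are not handled.
function detect_bom(string $bytes): ?string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UTF-16LE';   // displays as "ÿþ" when read as ISO-8859-1
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16BE';   // displays as "þÿ" when read as ISO-8859-1
    }
    return null;             // no recognised BOM
}
```

 For example, detect_bom("\xFF\xFEh\x00i\x00") returns 'UTF-16LE'.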
 
 >The file is originally encoded as UTF-16. Can anybody tell me why
 >this happens?
 
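 The utf8_encode function can't convert from UTF-16; it only converts from
 ISO-8859-1, treating every input byte as one character. That is why the BOM
 bytes FF FE come out as the two characters ÿþ.
 Perhaps you need one of the mbstring functions, such as mb_convert_encoding.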
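
 Something along these lines might do it. This is only a sketch: it assumes
 the mbstring extension is loaded, utf16_to_utf8 is a name invented for
 illustration, and the big-endian fallback for input without a BOM is a guess
 you may want to change (files written on Windows are usually little-endian).

```php
<?php
// Hypothetical helper: convert UTF-16 bytes to UTF-8, dropping the BOM.
// Requires the mbstring extension.
function utf16_to_utf8(string $bytes): string
{
    $bom = substr($bytes, 0, 2);

    if ($bom === "\xFF\xFE") {
        // Little-endian BOM: strip it and convert the rest.
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16LE');
    }
    if ($bom === "\xFE\xFF") {
        // Big-endian BOM: strip it and convert the rest.
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16BE');
    }

    // No BOM: assume big-endian (an assumption, adjust for your data).
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16BE');
}
```

 With this, utf16_to_utf8("\xFF\xFEh\x00i\x00") gives "hi", with no stray
 characters at the front.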
 
 --
 Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
 http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool