Posted by Andy Hassall on 01/09/07 00:43
On Tue, 09 Jan 2007 01:14:28 +0100, Matthias Langbein
<matthias_langbein@web.de> wrote:
>When I convert an uploaded file to UTF-8 with the utf8_encode function,
>the string is prefixed by the two characters
>
>ÿþ
That appears to be a Unicode BOM:
http://unicode.org/unicode/faq/utf_bom.html#BOM
However, ÿþ is how the bytes FF FE display in ISO-8859-1, which makes it a
UTF-16 little-endian BOM (big-endian would be FE FF). It should presumably have
been converted into EF BB BF, the UTF-8 representation, or stripped off.
>The file is originally encoded as UTF-16. Can anybody tell me, why
>this happens?
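The utf8_encode function can't convert from UTF-16; it assumes its input is
ISO-8859-1 and converts each byte as a separate character, so the BOM bytes
survive as two characters and the rest of the text is mangled as well.
You probably need one of the mbstring functions, such as mb_convert_encoding.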
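
Something along these lines might do it. This is only a sketch: it assumes the
mbstring extension is loaded, the function name utf16_to_utf8 is made up for
illustration, and falling back to big-endian when there is no BOM is my
assumption, not something your file guarantees.

```php
<?php
// Hypothetical helper (not a built-in): convert UTF-16 data to UTF-8,
// using the BOM, if present, to pick the byte order, then dropping it.
function utf16_to_utf8($data)
{
    $bom = substr($data, 0, 2);
    if ($bom === "\xFF\xFE") {
        // FF FE: UTF-16 little-endian
        $from = 'UTF-16LE';
        $data = (string) substr($data, 2);
    } elseif ($bom === "\xFE\xFF") {
        // FE FF: UTF-16 big-endian
        $from = 'UTF-16BE';
        $data = (string) substr($data, 2);
    } else {
        // Assumption: no BOM means big-endian.
        $from = 'UTF-16BE';
    }
    // mb_convert_encoding(string, to_encoding, from_encoding)
    return mb_convert_encoding($data, 'UTF-8', $from);
}
```

You would call it on the raw uploaded bytes, e.g.
utf16_to_utf8(file_get_contents($path)) where $path is wherever your upload
ended up, instead of passing them through utf8_encode.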
--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool