Posted by Toby A Inkster on 02/19/07 12:47
A.Martini wrote:
> the source is iso-8859-1 and is converted with utf8_encode before
> xml_parse
I'm pretty sure you're on the right track with it being a character
encoding issue.
However, if the source string only contains *entities* (e.g. &#8220;)
rather than true non-ASCII characters (e.g. “), then utf8_encode shouldn't
actually do anything with the source. (It doesn't need to, as ASCII is a
proper subset of both ISO-8859-1 and UTF-8.)
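
To illustrate the point about utf8_encode, here is a minimal sketch. The sample strings and variable names are made up for the example, and the raw byte used is 0xE9 (e-acute), since that one definitely exists in ISO-8859-1. Note that utf8_encode is deprecated as of PHP 8.2, so newer versions will emit a deprecation notice.

```php
<?php
// Entity-only input: every byte is plain ASCII.
$entityOnly = 'He said &#8220;hi&#8221;';
// ISO-8859-1 input with a real non-ASCII byte (0xE9 = e-acute).
$rawLatin1 = "caf\xE9";

$entityEncoded = utf8_encode($entityOnly);
$latin1Encoded = utf8_encode($rawLatin1);

// ASCII bytes pass through untouched, so the entity string is unchanged.
var_dump($entityEncoded === $entityOnly);
// The single byte 0xE9 becomes the two-byte UTF-8 sequence C3 A9.
var_dump($latin1Encoded === "caf\xC3\xA9");
```

Both comparisons should come out true: only the string with a genuine high byte is altered.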
So it is likely that either:
1. xml_parse is actually converting &#8220; => ?
OR
2. xml_parse is correctly converting &#8220; => “
   but whatever you do with the result isn't Unicode-aware.
Number 2 is likely, as (until PHP 6 comes out) most native PHP functions
don't handle Unicode very well.
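
One way to tell the two cases apart is to look at the raw bytes the parser hands back. This is only a sketch: extractText is a hypothetical helper written for this example, and the sample document is invented. It forces the parser's target encoding to UTF-8, then compares a byte count against a character count.

```php
<?php
// Hypothetical helper: collect all character data from an XML string,
// with the parser's output (target) encoding set explicitly.
function extractText(string $xml, string $targetEncoding): string
{
    $text = '';
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$text) {
            // The handler may fire several times for one text node.
            $text .= $data;
        }
    );
    xml_parse($parser, $xml, true);
    return $text;
}

$xml = '<?xml version="1.0" encoding="UTF-8"?><q>&#8220;hi&#8221;</q>';
$text = extractText($xml, 'UTF-8');

// Case 2: the entities were expanded to real UTF-8 curly quotes.
var_dump($text === "\xE2\x80\x9Chi\xE2\x80\x9D");

// strlen counts bytes, not characters, so it is not Unicode-aware.
$bytes = strlen($text);
// The /u modifier makes PCRE match whole UTF-8 characters.
$chars = preg_match_all('/./us', $text);
var_dump($bytes, $chars);
```

If the byte count is larger than the character count, the parser did its job and the damage is happening further down the line. Passing 'ISO-8859-1' as the target encoding instead would be the way to reproduce case 1, since the curly quotes have no representation there.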
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux
* = I'm getting there!