Posted by Andy Hassall on 02/03/06 00:00
On Thu, 02 Feb 2006 21:35:22 GMT, junk@junk.com wrote:
>Sorry if this has been asked before, and apologies if this is the
>wrong NG.
>
>I am using PHP 5.0.5 and Apache 2.0.54 in a Win2k environment.
>
>Lately I have been playing with RSS feeds. I managed to get "lastRSS"
>which is a simple RSS parser.
>
>When I tried to set up an RSS feed to eBay to get custom searches
>straight to my desktop I noticed that the UK Pound sterling symbol is
>shown preceded by a Latin capital A with circumflex. (An 'A' wearing
>a hat).
>
>I checked the RSS feed and the extra char is not there.
>
>So, I am unsure how to progress to sort this out. I don't know if PHP
>or apache is the problem. I can only find one other comment on Google
>where someone is having the same problem. But still no answer.
>
>I checked the changelogs for the latest versions of PHP and Apache
>and there is no mention of this bug.
>
>Is it just me?
>
>Any clues will be much appreciated.
The first thing to consider is the encoding: what encoding is the RSS feed in?
As it's XML, the most common encoding is UTF-8, which is also what an XML
parser assumes when the document declares no encoding.

What did you check the RSS feed with? If you used a browser or a half-decent
editor, it would most likely have understood the encoding and presented the
character correctly.

But your PHP code may be treating the UTF-8 data as single-byte ISO-8859-1.

A British pound symbol is two bytes in UTF-8: it's U+00A3, which is encoded as
0xC2 0xA3.
http://www.fileformat.info/info/unicode/char/00A3/index.htm
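To see those two bytes for yourself, here is a minimal sketch in Python (not PHP, purely because it makes the byte values easy to print). The variable names are just for illustration.

```python
# U+00A3 is the pound sign; encode it as UTF-8 and inspect the raw bytes.
pound = "\u00a3"
utf8_bytes = pound.encode("utf-8")

# Format each byte as hex so it can be compared with the values quoted above.
hex_bytes = [f"0x{b:02X}" for b in utf8_bytes]
print(hex_bytes)  # ['0xC2', '0xA3']
```

The same character in ISO-8859-1 is the single byte 0xA3, which is why the two encodings disagree about what 0xC2 0xA3 means.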
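If those two bytes are displayed as ISO-8859-1, each one is shown as a separate
character:
  0xC2 = Latin capital A with circumflex
  0xA3 = British pound symbol
which matches the 'A' wearing a hat in front of the pound sign that you describe.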
http://en.wikipedia.org/wiki/ISO_8859-1
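The effect, and one way out of it, can be sketched as follows. This is again Python rather than PHP, and it assumes the feed really is UTF-8 and that your page is being served as ISO-8859-1; the "9.99" price is made up for the example.

```python
# Bytes as they might arrive from a UTF-8 feed: pound sign followed by a price.
# (The price is a hypothetical value, not taken from any real feed.)
raw = b"\xc2\xa3" + b"9.99"

# Reading the bytes as ISO-8859-1 turns each byte into its own character,
# so the 0xC2 lead byte shows up as a stray A-circumflex.
wrong = raw.decode("iso-8859-1")

# Reading them as UTF-8 gives the single intended character.
right = raw.decode("utf-8")

# If the page has to stay ISO-8859-1, convert the text before output.
latin1_bytes = right.encode("iso-8859-1")

print(wrong)         # Â£9.99
print(right)         # £9.99
print(latin1_bytes)  # b'\xa39.99'
```

The alternative is to leave the data alone and declare the page itself as UTF-8, so that the browser decodes the two bytes as one character.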
--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool