Re: UTF8: file_put_contents doesn't seem to write UTF8 content properly

Posted by amygdala on 06/14/07 01:39

"Andy Hassall" <andy@andyh.co.uk> wrote in message
news:nbn073pqso001bsrhpkmpcqvnjoulkjjlb@4ax.com...
> On Wed, 13 Jun 2007 22:25:44 +0200, "amygdala" <noreply@noreply.com>
> wrote:
>
>>I'm trying to have PHP write a 'sitemap.xml' sitemap for Google and other
>>search engines. It works, except that the content of the XML file doesn't
>>seem to be UTF-8 (which it should be, judging by the information in
>>Google's webmaster help center).
>>
>>The way I test whether the content is UTF-8 is by opening the XML file in
>>Notepad and choosing 'Save as...'. Normally the encoding option would be
>>set to UTF-8, but now it just shows ANSI.
>
> Well, that's not a foolproof method...


I was afraid of that.
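
A byte-level check is more dependable than Notepad's guess. The following is a minimal sketch, assuming PHP 7 or later; `isValidUtf8` is a hypothetical helper name, not something from this thread. Note that a file containing only ASCII characters is already valid UTF-8, which may be why Notepad reports ANSI for it.

```php
<?php
// Hypothetical helper, not part of the original post.
function isValidUtf8(string $bytes): bool
{
    // With the /u modifier PCRE validates the subject as UTF-8 first;
    // preg_match() returns false on invalid input and 1 when the empty pattern matches.
    return preg_match('//u', $bytes) === 1;
}

$ascii  = '<loc>http://example.com/</loc>'; // ASCII only: valid UTF-8, yet an editor may still say ANSI
$latin1 = "caf\xE9";                        // e-acute as the single byte 0xE9 (ISO-8859-1 / Windows-1252)
$utf8   = "caf\xC3\xA9";                    // e-acute as the two bytes 0xC3 0xA9 (UTF-8)

var_dump(isValidUtf8($ascii));  // bool(true)
var_dump(isValidUtf8($latin1)); // bool(false)
var_dump(isValidUtf8($utf8));   // bool(true)
```

To check a generated file, the same helper could be applied to the result of file_get_contents() on that file.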


>>This is what I have tried to write UTF8 content with:
>>
>>file_put_contents( '.' . SITEMAP_FILE, utf8_encode(
>>$this->sitemapForCrawlers ) );
>>...and...
>>file_put_contents( '.' . SITEMAP_FILE, iconv( "ISO-8859-1", "UTF8",
>>$this->sitemapForCrawlers ) );
>>
>>...where...
>>SITEMAP_FILE is the filename constant
>>...and...
>>$this->sitemapForCrawlers is the string with XML data
>>
>>With the last attempt I even got an error saying:
>>
>>Wrong charset, conversion from `ISO-8859-1' to `UTF8' is not allowed in...
>>
>>Any ideas of how I can make this work?
>
> Start from the beginning: what character set encoding is the original data
> in? The error implies that it's not ISO-8859-1 (which does have some gaps
> where characters aren't valid...)
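
For reference, here is a minimal sketch of the convert-then-write step, assuming the source data really is ISO-8859-1. The variable names and the temp-file path are hypothetical stand-ins for SITEMAP_FILE and $this->sitemapForCrawlers. It spells the target charset 'UTF-8' with a hyphen; on some iconv builds an unrecognised charset name may itself trigger a "Wrong charset" message, though whether that applies here is an assumption. file_put_contents() writes the bytes it is given unchanged, so the encoding is decided entirely by the string passed in.

```php
<?php
// Hypothetical stand-ins for the poster's constant and property.
$sitemapFile   = sys_get_temp_dir() . '/sitemap-example.xml';
$sitemapLatin1 = "<loc>http://example.com/caf\xE9</loc>"; // fragment only, not a complete sitemap

// Convert first; iconv() returns false if the conversion fails.
$sitemapUtf8 = iconv('ISO-8859-1', 'UTF-8', $sitemapLatin1);
if ($sitemapUtf8 === false) {
    throw new RuntimeException('iconv could not convert the input');
}

// file_put_contents() performs no re-encoding: it writes the bytes as they are.
$written  = file_put_contents($sitemapFile, $sitemapUtf8);
$readBack = file_get_contents($sitemapFile);

echo $written, " bytes written\n";
echo bin2hex(substr($readBack, -8, 2)), "\n"; // the two bytes that now encode the e-acute
```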

Well... I discovered the 'Set Code Page...' option in UltraEdit, the main
editor I use to code PHP, and it tells me my PHP code files are encoded in
'1252 (ANSI - Latin I)'. So my next question is: what would be the correct
first parameter for the iconv function to tell it that the original data is
'1252 (ANSI - Latin I)'? I've tried numerous strings, which include:

'1252 (ANSI - Latin I)'
'1252'
'1252 ANSI'
'1252-ANSI'
'ANSI-1252'
'ANSI 1252'

...and variations.

Is there a table of the encoding names iconv accepts that I can consult?
Also, isn't '1252 (ANSI - Latin I)' just a pimped version of ISO-8859-1?
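
As a hedged illustration of both questions: common iconv builds (glibc, GNU libiconv) accept 'CP1252' or 'WINDOWS-1252' as the name for this code page, and the command-line tool's `iconv -l` typically lists the names a given build knows. Both points are assumptions about the installed iconv, not something confirmed in this thread. The two encodings agree except for bytes 0x80 to 0x9F, where Windows-1252 has printable characters such as the euro sign and ISO-8859-1 has control characters. A minimal sketch of the difference:

```php
<?php
// Assumption: this iconv build knows the name 'CP1252'; others may want 'WINDOWS-1252'.
$input = "\x80 5, caf\xE9"; // 0x80 is the euro sign in Windows-1252

$fromCp1252 = iconv('CP1252', 'UTF-8', $input);
$fromLatin1 = iconv('ISO-8859-1', 'UTF-8', $input);

echo bin2hex($fromCp1252), "\n"; // starts with e282ac: U+20AC, the euro sign
echo bin2hex($fromLatin1), "\n"; // starts with c280: U+0080, a control character
```

The e-acute (0xE9) converts identically either way, so the choice of source charset only matters for text that uses the 0x80 to 0x9F range.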

I'm still curious about this, though. Please also read my reply to C.

Thanks.

 
