Posted by Gérard Talbot on 04/08/07 07:52
Simply Confusing! wrote :
> Hi
>
> i'm looking for a simple answer to what could be a complex question.
>
> i'll try to make my question digestible.
>
> i've done some web-pages in chinese. i pretty much ALWAYS work in unicode
> sequences, meaning, I convert the word doc's with chinese char's into html,
> then transplant the UNICODE SEQUENCES (ie, characters represented with stuff
> like this: 樣的東 ... etc ) into my templates.
>
> somewhere I was told that for chinese, you use "big5" (traditional) and
> "gb1312" (simplified)
You most likely meant to say gb2312 here, not gb1312.
> for the charset attrib's on the Content-type metatag.
Unfortunately, you need more than that. The web server itself must send
the document with the matching charset (big5 or gb2312) in its HTTP
Content-Type header, and that header takes precedence over the meta tag.
Web servers are sometimes misconfigured in this respect. You may have to
ask your web server admin (I had to, in my case) so that, if you're
lucky, the Apache server can be configured to serve your document as
big5 or gb2312.
Content Negotiation (for Apache servers)
http://httpd.apache.org/docs/1.3/content-negotiation.html
One workaround I remember using (until the web server admin fixed the
problem) was to create an .htaccess file and set the character set in
it with
AddCharset GB2312 .html
AddCharset directive in Apache servers
http://httpd.apache.org/docs/1.3/mod/mod_mime.html#addcharset
FAQ: Setting charset information in .htaccess
http://www.w3.org/International/questions/qa-htaccess-charset
Setting the HTTP charset parameter
http://www.w3.org/International/O-HTTP-charset.en.php
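To illustrate why the declared charset matters, here is a minimal Python sketch. It is not something I ran on the server in question; the sample string and the function name decode_as are made up for illustration. It mimics what a browser effectively does when bytes saved as gb2312 arrive labelled as iso-8859-1:

```python
def decode_as(raw: bytes, declared_charset: str) -> str:
    """Decode bytes the way a browser would if it trusted the declared charset."""
    return raw.decode(declared_charset, errors="replace")


text = "中文网页"  # hypothetical sample: "Chinese web page" in simplified characters
raw = text.encode("gb2312")  # the bytes actually stored on the server

right = decode_as(raw, "gb2312")      # label matches the bytes
wrong = decode_as(raw, "iso-8859-1")  # label does not match the bytes

print(right)
print(wrong)  # Latin-1 gibberish, two characters per Chinese character
```

The bytes are identical in both cases; only the label differs, which is why the fix belongs in the server configuration rather than in the page content.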
> This I did, but occasionally, the browser would display ascii-gibberish, and
> occasionally weird things would happen between where I'd download the
> gibberish containing file, and my unicode sequences had actually been
> replaced by ascii-gibberish. odd.
>
> so then I reverted to using the iso-8859-1 charset attrib, and everything
> settled down. no problem. I use the lang-tags zh-tw and zh-cn to ID my
> pages as traditional or simplified. (yes, i know that does not relate to
> char display).
>
What you need here are the HTTP response headers for your webpages, so
that you can know for sure how each page is being served. From the
symptoms you describe, I would bet this is what is happening: your
web server is not configured to serve your webpage with the
correct (intended) character set.
View HTTP Request and Response Header
http://web-sniffer.net/
Most developer tools and toolbars have an HTTP headers feature.
E.g.:
LiveHTTPHeaders
http://livehttpheaders.mozdev.org/
You can even have a bookmarklet for that:
Jesse Ruderman Validation Bookmarklets
http://www.squarefree.com/bookmarklets/validation.html
More and more browsers now provide such a feature too, or a page info
panel showing how the document was served. For Opera 9:
Opera W3-Dev Menu
http://tobyinkster.co.uk/opera
W3-dev > More Page tests > HTTP Headers
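If none of those tools is at hand, the same check can be scripted. The sketch below is only an illustration in Python (standard library only). The names charset_from_content_type and served_charset are invented here, and the URL in the usage note is a placeholder. It reads the Content-Type response header and reports the charset the server declares, if any:

```python
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_from_content_type(value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()  # lower-cased, or None if absent


def served_charset(url: str) -> Optional[str]:
    """Fetch a URL and return the charset declared in the HTTP response header."""
    with urlopen(url) as response:
        return charset_from_content_type(response.headers.get("Content-Type", ""))


print(charset_from_content_type("text/html; charset=GB2312"))  # gb2312
print(charset_from_content_type("text/html"))                   # None
```

Calling served_charset("http://www.example.com/") with your own page's address in place of the placeholder shows what the server actually sends. If the result differs from the encoding the file was saved in, the server configuration is the first thing to look at.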
> so i recently found a chinese language site and checked out the source code.
> it was puzzling because the charset was utf-8 and the source was actually in
> original chinese characters, not unicode.
>
> i'm quite puzzled now. my chinese pages are displaying fine with unicode
> under iso-8859-1, but I'm not sure what the "definitive" way is to display
> non-latin character sequences. is there one?
I'd bet the chances are 99% that your web server is misconfigured and
cannot send your webpage as big5 or gb2312.
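As an aside on why the pages still display correctly under iso-8859-1: numeric character references are written in plain ASCII, so they come through unchanged whichever of these charsets is declared. A small Python sketch shows the round trip. It assumes decimal references, and the sample characters are the ones quoted in the question:

```python
import html

text = "樣的東"  # sample characters quoted in the question

# Replace every non-ASCII character with a decimal numeric character reference.
ncr = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(ncr)  # &#27171;&#30340;&#26481;

# The result is pure ASCII, so these charsets all produce identical bytes for it.
for charset in ("iso-8859-1", "big5", "gb2312", "utf-8"):
    assert ncr.encode(charset) == ncr.encode("ascii")

# An HTML parser turns the references back into the original characters.
print(html.unescape(ncr) == text)  # True
```

This only explains why the references themselves are safe; it does not say which charset the server ought to declare for pages that contain raw Chinese characters.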
> i'd be particularly interested in hearing from asians who design asian
> sites;
On-line Chinese Tools
http://projects.ldc.upenn.edu/Chinese/info_it.htm
Penn State lab courses on computing in foreign scripts:
Tips for Developing Non-English Web Sites
http://tlt.its.psu.edu/suggestions/international/
Penn State lab courses on computing in foreign scripts: Chinese
(Simplified & Traditional)
http://tlt.its.psu.edu/suggestions/international/bylanguage/chinese.html
> also from western coders who have successfully developed chinese
> language sites, or other non-latin language sites (russian, hebrew, arabic,
> etc...)
Help Chinese translation page
http://www.gtalbot.org/DHTMLSection/HelpChineseTranslationPage.html
I have done webpages in over 20 languages, including Chinese, Russian,
Hebrew, Arabic and even Inuktitut.
Site Map
http://www.gtalbot.org/Varia/SiteMap.html
Gérard
--
Using Web Standards in your Web Pages (Updated Dec. 2006)
http://developer.mozilla.org/en/docs/Using_Web_Standards_in_your_Web_Pages