Posted by Jukka K. Korpela on 09/11/07 09:47
Scripsit shror:
> when I was testing the Arabic side the <form> wasn't sent correctly
> when I entered Arabic text.
> What I got was nothing except Unicode letters like this:
> &#1576;&#1575;&#1605;&#1576;
As usual, revealing a URL would have helped to analyze the problem. But it
looks pretty obvious that the problem is in the character encoding of the
form data.
For example, if a page is ASCII (or ISO-8859-1) encoded, then the form data
encoding is the same by default, and if you enter an Arabic character in a
form there, the effect is _undefined_ by HTML specifications. What browsers
might do is to represent the characters that have no representation in the
encoding by character references like &#1576; (or by entity references, when
applicable). This is really odd, since the form data is just character data,
not HTML, but on the other hand, what else could a poor browser do?
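To see where such references come from, here is a minimal Python sketch that mimics the fallback described above. It is only an illustration, not what any particular browser actually runs; the function name as_browser_might_send is made up for this example, and ISO-8859-1 is assumed as the page encoding.

```python
def as_browser_might_send(text: str, page_encoding: str = "iso-8859-1") -> bytes:
    """Encode form text in the page encoding (hypothetical helper).

    Characters that the encoding cannot represent are replaced by
    numeric character references such as &#1576;, which is roughly
    what browsers tend to do in this situation.
    """
    return text.encode(page_encoding, errors="xmlcharrefreplace")


# Four Arabic letters (U+0628, U+0627, U+0645, U+0628) typed into a form
# on an ISO-8859-1 page.
print(as_browser_might_send("\u0628\u0627\u0645\u0628"))
# b'&#1576;&#1575;&#1605;&#1576;'
```

Text that fits the page encoding passes through unchanged, which is why the problem only shows up on the Arabic side.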
You could tweak your form handler into dealing with such references, but the
real solution is to make the page UTF-8 encoded and to make the form handler
deal with UTF-8 data.
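
A rough sketch of both approaches in Python, assuming the handler receives the raw URL-encoded query string. The function name read_field and the field name "name" are invented for the example; your form handler will look different.

```python
import html
from urllib.parse import parse_qs


def read_field(query_string: str, field: str) -> str:
    """Return one form field as text (hypothetical helper).

    The percent-encoded bytes are decoded as UTF-8, which is what a
    browser sends when the page itself is UTF-8 encoded.
    """
    values = parse_qs(query_string, encoding="utf-8", errors="strict")
    raw = values.get(field, [""])[0]
    # The tweak: also turn references like &#1576; back into characters.
    # This is ambiguous, since a user may really have typed "&#1576;".
    return html.unescape(raw)


# Data from a UTF-8 page: the characters arrive as UTF-8 bytes.
print(read_field("name=%D8%A8%D8%A7%D9%85%D8%A8", "name"))

# Data from an ISO-8859-1 page: the characters arrive as references.
print(read_field("name=%26%231576%3B%26%231575%3B", "name"))
```

With a UTF-8 page the html.unescape step is not needed at all, and dropping it removes the ambiguity.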
--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/