Posted by Erwin Moller on 03/30/07 11:00
Willem Bogaerts wrote:
>> A few days ago I discovered that the euro sign is not defined in all
>> font families.
>
> This is a client issue - there is nothing you can do about it. All you
> can do is use HTML entities (&euro;) so the browser knows what you mean
> (and maybe switches fonts, depending on how intelligent the browser is).
>
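A minimal sketch of the entity approach mentioned above, written in Python purely for illustration (the thread itself is about PHP). It assumes the page may only contain ASCII bytes and shows that the named, decimal and hexadecimal references all stand for the same character, U+20AC. The sample text "Price: 5" is made up.

```python
import html

EURO = "\u20ac"  # the euro sign, U+20AC

# Characters that plain ASCII cannot hold are written as numeric references.
ascii_safe = ("Price: 5 " + EURO).encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_safe)

# Named, decimal and hexadecimal references decode to the same character.
decoded = [html.unescape(ref) for ref in ("&euro;", "&#8364;", "&#x20AC;")]
print(decoded)
```

Whether the glyph is actually drawn still depends on the fonts available on the client, as noted above.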
>> They cannot produce the right sign, no matter whether I use &euro; or
>> the hexadecimal equivalent.
>> After a little research I found I could put font-tags around the
>> euro-sign with another font-family (Arial in this case) to get the Euro
>> sign.
>>
>> I am completely graphically impaired, and only understand programming code
>> (and HTML/JavaScript of course), so this is a weak point on my side,
>> hence this question.
>>
>> I target Europe only at the moment (no need for Chinese
>> character support).
>> That said, will the following setup make sense?
>>
>> Postgresql db encoding scheme: LATIN1
>> In the headers of all my HTML: Content-Type: text/html;
>> charset=iso-8859-1
>
> Latin-1 does not include a euro sign at all. However, latin-1 is
> sometimes replaced by enhanced encodings (like cp-1252, the Windows
> encoding), in which the euro sign does appear.
>
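The point about Latin-1 can be checked directly. This is a small Python sketch (Python is used here only for illustration) that tries to encode the euro sign in a few encodings. The helper name try_encode is hypothetical, and iso-8859-15 and utf-8 are added to the comparison as extra data points that the post above does not mention.

```python
EURO = "\u20ac"  # the euro sign, U+20AC

def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

results = {
    name: try_encode(EURO, name)
    for name in ("iso-8859-1", "cp1252", "iso-8859-15", "utf-8")
}
for name, encoded in results.items():
    print(name, encoded)

# The cp1252 euro byte means something else when read as strict Latin-1:
# it decodes to the invisible control character U+0080.
misread = b"\x80".decode("iso-8859-1")
```

So a page declared as iso-8859-1 only shows a euro sign when the browser quietly treats it as cp1252.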
>> A few related questions:
>> 1) Will people be able to copy/paste info from other sources (like
>> word-processing programs and other websites) into my forms?
>
> In short: yes. It is up to the browser to convert the encoding to the
> one used by the OS. I never had any trouble with it.
>
>> 2) Can I use regular expressions as I am used to (ASCII) in my PHP code?
>> Will I match e acute, euro sign, etc.?
>
> Yes. All latin-1 characters are just one byte. No problem.
>
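A hedged illustration of the one-byte argument, in Python rather than PHP, so the details of PHP's own regex functions may differ. It assumes a byte-oriented regex engine and the sample word "café". In Latin-1 a single "." covers the e acute, in UTF-8 it does not, and an ASCII-only class such as [a-z] never matches it in either case.

```python
import re

word = "caf\u00e9"                         # "café"
latin1_bytes = word.encode("iso-8859-1")   # 4 bytes, one per character
utf8_bytes = word.encode("utf-8")          # 5 bytes, the e acute takes two

# "." matches exactly one byte in a bytes pattern.
dot_latin1 = re.fullmatch(rb"caf.", latin1_bytes)   # matches
dot_utf8 = re.fullmatch(rb"caf.", utf8_bytes)       # no match: one byte left over

# An ASCII character class does not know about accented letters.
ascii_class = re.fullmatch(r"[a-z]+", word)         # no match

# On raw bytes \w is ASCII-only; on decoded text it is Unicode-aware.
bytes_word = re.fullmatch(rb"\w+", latin1_bytes)    # no match
text_word = re.fullmatch(r"\w+", word)              # matches
```

In other words, one byte per character keeps "." and string lengths honest, but patterns written with ASCII classes in mind still need the accented characters added explicitly.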
>> 3) Will the round trip described below have problems with the normally
>> expected European characters?
>>
>> client copies some text from some source ->
>> paste in the form ->
>> receive by PHP ->
>> insert in Postgresql (or update) ->
>> retrieve from postgresql ->
>> display as HTML (with Content-Type: text/html; charset=iso-8859-1)
>>
>> Is that OK?
>> Any pitfalls?
>> Should I maybe use UTF-8?
>
> I switched to using utf-8 a few months ago, and I still have trouble
> with it. For some vague reason, you can set all encoding startup
> variables to utf-8, and connections are STILL made with latin-1 unless
> you specifically use the SET NAMES command. Someone wrote an article
> "utf-8, love at fifth sight". That is so true! It can do a lot, but it is
> a real hell to configure all systems to use it. Furthermore, the
> implementations are all non-encoding-aware. The problem is that a text
> always has an encoding, while a string does not. And texts are treated
> as strings, so with every string operation, you will have to make sure
> that the correct encoding is used.
>
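The remark that a text has an encoding while a string does not can be made concrete. This is a minimal Python sketch, assuming a form submission that arrives as UTF-8 bytes; the payload is invented for the example. The same bytes give different text depending on which encoding the receiving side assumes.

```python
# Bytes as they might arrive from a form on a UTF-8 page (made-up sample).
payload = "\u20ac 5 caf\u00e9".encode("utf-8")

right = payload.decode("utf-8")    # the text the user typed
wrong = payload.decode("cp1252")   # same bytes, wrong assumption: mojibake

print(len(payload), len(right), len(wrong))
print(right)
print(wrong)

# The damage is reversible only as long as nothing re-encoded or truncated it.
repaired = wrong.encode("cp1252").decode("utf-8")
```

Every step in the round trip (form, PHP, database connection, database storage, HTML output) has to make the same assumption, otherwise the wrong branch above is what ends up on the page.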
>>
>> Any pointers are hugely appreciated because, to me, this is all quite
>> confusing.
>
> Here are some links:
> http://www.phpwact.org/php/i18n/charsets
> http://www.gravitonic.com/downloads/talks/intlphpcon2005/php_unicode.pdf
>
> Best regards
Thank you, Willem.
Exactly the kind of info I needed to read.
I like the link to www.joelonsoftware.com/articles/Unicode.html
He describes a type of programmer that fits me exactly: the one trying
to ignore issues with character sets. :-)
[quote]
So I have an announcement to make: if you are a programmer working in 2003
and you don't know the basics of characters, character sets, encodings, and
Unicode, and I catch you, I'm going to punish you by making you peel onions
for 6 months in a submarine. I swear I will.
And one more thing: IT'S NOT THAT HARD.
In this article I'll fill you in on exactly what every working programmer
should know. All that stuff about "plain text = ascii = characters are 8
bits" is not only wrong, it's hopelessly wrong, and if you're still
programming that way, you're not much better than a medical doctor who
doesn't believe in germs. Please do not write another line of code until
you finish reading this article.
[/quote]
I think I'll follow his advice (threat). ;-)
Time to grow up/read up.
Thanks.
Regards,
Erwin Moller