Reply to Re: Encoding/characterset/font family confusion

Posted by Willem Bogaerts on 03/30/07 10:22

> A few days ago I discovered that the euro sign is not defined in all
> font families.

This is a client issue; there is nothing you can do about it. All you can
do is use HTML entities (&euro;) so the browser knows what you mean (and
maybe switches fonts, depending on how intelligent the browser is).
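
As a small illustration of what the entity stands for (sketched in Python rather than PHP, with made-up variable names): the named entity and the decimal and hexadecimal references all resolve to the same character, U+20AC. Whether a glyph is then available is still up to the client's fonts.

```python
import html

# The named entity and both numeric forms all denote U+20AC (the euro sign).
named = html.unescape("&euro;")
decimal = html.unescape("&#8364;")
hexadecimal = html.unescape("&#x20AC;")

# Going the other way: write the euro sign as a numeric reference so the
# page source stays pure ASCII, whatever charset the page declares.
ascii_safe = "Price: 10 \u20ac".encode("ascii", "xmlcharrefreplace").decode("ascii")

print(named == decimal == hexadecimal)
print(ascii_safe)
```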

> They cannot produce the right sign no matter whether I use &euro; or the
> hexadecimal equivalent.
> After a little research I found I could put font-tags around the euro-sign
> with another font-family (Arial in this case) to get the Euro sign.
>
> I am completely graphically impaired, and only understand programming code (and
> HTML/JavaScript of course), so this is a weak point on my side, hence this
> question.
>
> I target Europe only at the moment (no need for Chinese
> character support).
> That said, will the following setup make sense?
>
> Postgresql db encoding scheme: LATIN1
> In the headers of all my HTML: content-type: text/html charset: iso-8859-1

Latin-1 (ISO-8859-1) does not include a euro sign at all. However, Latin-1
is sometimes replaced by an extended encoding such as cp1252 (the Windows
Western European code page), and there the euro sign does appear.
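
A quick way to see the difference, sketched in Python (the helper name encode_or_none is made up for this example). ISO-8859-15 is not mentioned above; it is included only as a second single-byte encoding that does contain the euro sign.

```python
EURO = "\u20ac"  # the euro sign

def encode_or_none(text, encoding):
    """Return the encoded bytes, or None if the encoding lacks a character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

latin1 = encode_or_none(EURO, "iso-8859-1")   # None: no euro sign in Latin-1
cp1252 = encode_or_none(EURO, "cp1252")       # one byte, 0x80
latin9 = encode_or_none(EURO, "iso-8859-15")  # one byte, 0xA4

print(latin1, cp1252, latin9)
```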

> A few related questions:
> 1) Will people be able to copy/paste info from other sources (like
> wordprocessing programs and other websites) into my forms?

In short: yes. It is up to the browser to convert pasted text between the
encoding used by the OS and the one used by the page. I have never had any
trouble with it.

> 2) Can I use regular expressions as I am used to (ASCII) in my PHP code?
> Will I match e acute, euro sign, etc.?

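Yes. Every Latin-1 character is exactly one byte, so byte-oriented regular
expressions work on it without problems (the euro sign only if the encoding
actually contains it).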
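
To illustrate why the one-byte property matters, here is a sketch in Python using byte patterns, which behave like a regex engine that is not encoding-aware. The sample text is made up; the point is only the contrast with UTF-8, where an accented character takes more than one byte.

```python
import re

text = "caf\u00e9 cr\u00e8me"  # "café crème"
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

# In Latin-1 every character is one byte (é is 0xE9), so "." in a byte
# pattern consumes exactly one character.
latin1_match = re.search(rb"caf.", latin1_bytes)

# In UTF-8, é is two bytes (0xC3 0xA9), so the same byte-oriented "."
# only picks up the first half of the character.
utf8_match = re.search(rb"caf.", utf8_bytes)

print(latin1_match.group(), utf8_match.group())
```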

> 3) Will the round trip described below have problems with normal expected
> European characters?
>
> client copies some text from some source ->
> paste in the form ->
> receive by PHP ->
> insert in Postgresql (or update) ->
> retrieve from postgresql ->
> display as HTML (with content-type: text/html charset: iso-8859-1)
>
> Is that OK?
> Any pitfalls?
> Should I maybe use UTF-8?

I switched to UTF-8 a few months ago, and I still have trouble
with it. For some vague reason, you can set all encoding startup
variables to UTF-8, and connections are STILL made with Latin-1 unless
you specifically use the SET NAMES command. Someone wrote an article
"utf-8, love at fifth sight". That is so true! It can do a lot, but it is
a real hell to configure all systems to use it. Furthermore, the
string implementations are not encoding-aware. The problem is that a text
always has an encoding, while a string does not. Texts are treated
as strings, so with every string operation you have to make sure
that the correct encoding is used.
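
The text-versus-string distinction can be sketched in Python, where the two are separate types (str for text, bytes for the raw string). The sample value is made up. A byte-oriented string function sees only the raw form, and the same bytes read with the wrong encoding give a different text.

```python
text = "\u00e9\u20ac"        # the text "é€": two characters
raw = text.encode("utf-8")   # the bytes a non-encoding-aware function sees

char_count = len(text)       # counts characters
byte_count = len(raw)        # counts bytes: 2 for é plus 3 for €

# The same bytes, interpreted with two different encodings.
reread = raw.decode("utf-8")     # the original text again
misread = raw.decode("cp1252")   # one character per byte: garbage

print(char_count, byte_count, reread == text, misread == text)
```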

>
> Any pointers are hugely appreciated because, to me, this is all quite
> confusing.

Here are some links:
http://www.phpwact.org/php/i18n/charsets
http://www.gravitonic.com/downloads/talks/intlphpcon2005/php_unicode.pdf

Best regards
--
Willem Bogaerts

Application smith
Kratz B.V.
http://www.kratz.nl/
