Reply to Re: locale fr_FR.utf8 and str_word_count() — PHP Programming Language

Posted by Kimmo Laine on 08/30/06 12:25

"Peter Münster" <look@signature.invalid> wrote in message
news:Pine.LNX.4.64.0608301400400.871@gaston.deltadore.bzh...
> On Wed, 30 Aug 2006, Kimmo Laine wrote:
>
>> That might be a multibyte-string related problem. If the string is
>> encoded using a multibyte charset, such as utf-8, it could be the
>> reason str_word_count is confused.
>
> Yes, you're right: I've just tried with fr_FR.iso885915 and it works.

That's great. :)

>> Once you've installed multibyte library, you could try writing a regular
>> expression for counting the words and use it with the mb_ereg* functions.
>
> Thanks for the hint. As a workaround I use already a regular expression to
> get the words, but str_word_count() is still better than my solution:
> str_word_count() detects constructs like "it's" and "week-end" etc.

There are some regexp-based substitutes for str_word_count in the
user-contributed notes on its php.net manual page. You might want to check
them.

For example rcATinterfacesDOTfr suggests that

$word_count = count(preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY));

should work. The advantage of this solution is that the mbstring extension
has a multibyte counterpart, mb_split, so you could take the same approach
with the mb-functions if you wanted to use utf-8.
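
As a rough sketch of the utf-8 route (my own illustration, not one of the
manual comments): instead of splitting on separators you can match the words
directly with a Unicode-aware pattern. The function name utf8_word_count is
made up for this example, and it assumes your PCRE build supports UTF-8 and
Unicode properties (the /u modifier and \p{L}). Apostrophes and hyphens are
allowed inside a word, so "it's" and "week-end" each count once, which is
roughly what str_word_count() does for single-byte text.

```php
<?php
// Hypothetical helper for illustration only: counts words in a UTF-8 string.
function utf8_word_count($text)
{
    // \p{L} matches any Unicode letter and \p{M} any combining mark.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // A word starts with a letter and may continue with letters,
    // combining marks, apostrophes, or hyphens.
    $pattern = '/\p{L}[\p{L}\p{M}\'-]*/u';

    // preg_match_all() returns the number of full matches,
    // or false on error (for example, invalid UTF-8 input).
    $count = preg_match_all($pattern, $text, $matches);

    return $count === false ? 0 : $count;
}

// Usage: pass the text as a UTF-8 encoded string.
echo utf8_word_count("C'est l'été, bon week-end à tous"), "\n";
```

Digits are not treated as part of a word here; add \p{N} to the character
class if you want "mp3" to count as one word.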

I try to enforce utf-8 whenever possible, simply because of its advantages
in international, multilingual communication, even though it has its
disadvantages as well.

--
"Ohjelmoija on organismi joka muuttaa kofeiinia koodiksi" - lpk
http://outolempi.net/ahdistus/ - Satunnaisesti päivittyvä nettisarjis
spam@outolempi.net || Gedoon-S @ IRCnet || rot13(xvzzb@bhgbyrzcv.arg)
