Posted by Peter Münster on 08/30/06 12:08
On Wed, 30 Aug 2006, Kimmo Laine wrote:
> That might be a multibyte-string related problem. If the string is encoded
> using multibyte charset, such as utf-8, it could be the reason
> str_word_count is confused.
Yes, you're right: I've just tried with the fr_FR.iso885915 locale and it works.
> Once you've installed multibyte library, you could try writing a regular
> expression for counting the words and use it with the mb_ereg* functions.
Thanks for the hint. As a workaround I already use a regular expression to
get the words, but str_word_count() is still better than my solution:
it detects constructs like "it's" and "week-end" as single words.
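A regular-expression workaround of this kind could look roughly like the sketch below. It assumes UTF-8 input and a PCRE build with Unicode support. The name utf8_word_count is hypothetical (it is not a PHP built-in), and the pattern only approximates how str_word_count() treats apostrophes and hyphens inside words.

```php
<?php
// Hypothetical helper, not part of PHP: counts words in a UTF-8 string.
// A "word" here is a run of Unicode letters, optionally joined by single
// apostrophes or hyphens, so "it's" and "week-end" each count once.
function utf8_word_count(string $text): int
{
    // \p{L} matches any Unicode letter; the /u modifier tells PCRE to
    // treat both the pattern and the subject as UTF-8.
    $count = preg_match_all("/\\p{L}+(?:['’-]\\p{L}+)*/u", $text);

    // preg_match_all() returns false on failure (e.g. invalid UTF-8).
    return $count === false ? 0 : $count;
}

echo utf8_word_count("C'est le week-end, déjà l'été"), "\n"; // prints 5
```

Unlike str_word_count(), this does not depend on the current locale, but it needs the input to be valid UTF-8; on invalid input the sketch simply returns 0.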
Is multi-byte support planned for str_word_count()?
Cheers, Peter
--
email: pmrb at free.fr
http://pmrb.free.fr/contact/