Re: breaking up an UTF8 string

Posted by Markus on 07/20/07 06:50

Mathias K. wrote:
> Hello!
>
> Could anyone give me a hint on how to split a UTF-8 string character by
> character with PHP?
>
> In my case it's a Japanese string. As the length of each Japanese
> character can differ from 2 to 6 bytes, I don't know how to find out where
> one character begins and the other ends.
>
> I tried splitting it up with mb_split:
>
> $departed = implode('<br>', mb_split("\w", $word));
>
> Well, it doesn't seem to work. The Japanese characters get totally messed up.
>
> Does anyone have a clue what regex to use, or how else I could split a
> Japanese string character by character?

I am not too familiar with Japanese or mbstring, but have you made sure
the proper encodings are set? See mb_internal_encoding() and
mb_regex_encoding().
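
A minimal sketch of setting both encodings before any other mb_* call, assuming the input really is UTF-8 and the script file itself is saved as UTF-8. The sample string is made up for illustration:

```php
<?php
// Assumption: all text handled by this script is UTF-8.
mb_internal_encoding('UTF-8'); // default for mb_strlen(), mb_substr(), ...
mb_regex_encoding('UTF-8');    // used by mb_split(), mb_ereg(), ...

$word = '日本語'; // hypothetical sample input, three characters

// With the encoding set, mb_strlen() counts characters; strlen() counts bytes.
$charCount = mb_strlen($word);
$byteCount = strlen($word);
```

If the two counts come out equal for a Japanese string, the encoding is probably not what you think it is.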

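Also, I think that mb_split() discards whatever the pattern matches, and
"\w" matches each word character, so the characters themselves are thrown
away. Should it not rather be mb_split("", $word)? I have not checked
whether mb_split() accepts an empty pattern, though.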
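
If you want to stay with a regex, a common idiom uses the PCRE functions instead of mbstring: an empty pattern with the "u" (UTF-8) modifier. A small sketch, assuming $word holds valid UTF-8; the sample value is invented for illustration:

```php
<?php
$word = '日本語'; // hypothetical sample input

// The empty pattern matches at every character boundary; the "u" modifier
// makes PCRE treat the subject as UTF-8 instead of raw bytes.
// PREG_SPLIT_NO_EMPTY drops the empty strings at the start and end.
$chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);

// preg_split() returns false when $word is not valid UTF-8.
$departed = ($chars === false) ? '' : implode('<br>', $chars);
```

Without the "u" modifier the same pattern splits between bytes, which would explain garbled output.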

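Thinking of alternative methods, you can try something like:

$chars = array();
$length = mb_strlen($word, 'UTF-8');  // number of characters, not bytes
for ($i = 0; $i < $length; $i++) {
    $chars[] = mb_substr($word, $i, 1, 'UTF-8');  // one character at offset $i
}
$departed = implode('<br>', $chars);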
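
On PHP 7.4 or later, mbstring also has mb_str_split(), which does this in one call. The sketch below wraps both variants in a helper; the name split_chars() is hypothetical, and it assumes the input is UTF-8:

```php
<?php
// split_chars() is a hypothetical helper name, not part of mbstring.
function split_chars($text)
{
    if (function_exists('mb_str_split')) {
        // PHP 7.4+: one call, one character per array element.
        return mb_str_split($text, 1, 'UTF-8');
    }
    // Older PHP: character-by-character loop with an explicit encoding.
    $chars = array();
    $length = mb_strlen($text, 'UTF-8');
    for ($i = 0; $i < $length; $i++) {
        $chars[] = mb_substr($text, $i, 1, 'UTF-8');
    }
    return $chars;
}

$word = '日本語'; // made-up sample input
$departed = implode('<br>', split_chars($word));
```

Passing the encoding explicitly keeps the helper independent of whatever mb_internal_encoding() happens to be set to.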

Finally, if there is a problem with the mbstring functions, you can try
the PEAR I18N_UnicodeString class:
http://pear.php.net/package/I18N_UnicodeString

It is very handy for converting a UTF-8 string into an array of decimal
Unicode code points:

require_once 'I18N/UnicodeString.php';  // usual PEAR install path; adjust if yours differs
$numbers = I18N_UnicodeString::utf8ToUnicode($word);  // one code point per character
$chars = array();
foreach ($numbers as $nr) {
    $chars[] = I18N_UnicodeString::unicodeCharToUtf8($nr);  // back to a single UTF-8 character
}
$departed = implode('<br>', $chars);

(None of these examples have been tested.)

HTH
Markus

 
