checking to see if a character is UTF8 — PHP Programming Language


Posted by lkrubner on 11/18/05 00:15

This is a function that someone posted on www.php.net:

function seemsUTF8($Str) {
    // bmorel at ssi dot fr
    // 17-Feb-2004 01:22
    // Here is an improved version of that function, compatible with the
    // 31-bit encoding scheme of Unicode 3.x:
    for ($i = 0; $i < strlen($Str); $i++) {
        if (ord($Str[$i]) < 0x80) continue;                  # 0bbbbbbb
        elseif ((ord($Str[$i]) & 0xE0) == 0xC0) $n = 1;      # 110bbbbb
        elseif ((ord($Str[$i]) & 0xF0) == 0xE0) $n = 2;      # 1110bbbb
        elseif ((ord($Str[$i]) & 0xF8) == 0xF0) $n = 3;      # 11110bbb
        elseif ((ord($Str[$i]) & 0xFC) == 0xF8) $n = 4;      # 111110bb
        elseif ((ord($Str[$i]) & 0xFE) == 0xFC) $n = 5;      # 1111110b
        else return false;                                   # Does not match any model
        for ($j = 0; $j < $n; $j++) {
            # n bytes matching 10bbbbbb follow ?
            if ((++$i == strlen($Str)) || ((ord($Str[$i]) & 0xC0) != 0x80))
                return false;
        }
    }
    return true;
}

What is achieved by the variable $n? I don't know enough about
character codes to understand what that final inner for loop is trying
to do.
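
For what it's worth, here is a minimal sketch of what $n appears to track, judging from the comments in the function: the number of continuation bytes (bytes of the form 10bbbbbb) that have to follow a given lead byte. The helper names continuationCount and isContinuation are hypothetical, made up for this illustration, and the sketch only covers lead bytes for sequences of up to four bytes, not the 5- and 6-byte forms the original also accepts.

```php
<?php
// Hypothetical helper (not part of the original function): given the
// numeric value of a lead byte, return how many continuation bytes
// must follow it, or -1 if this sketch does not recognize the byte.
function continuationCount(int $byte): int {
    if ($byte < 0x80) return 0;            // 0bbbbbbb: single-byte ASCII
    if (($byte & 0xE0) === 0xC0) return 1; // 110bbbbb
    if (($byte & 0xF0) === 0xE0) return 2; // 1110bbbb
    if (($byte & 0xF8) === 0xF0) return 3; // 11110bbb
    return -1;                             // not handled in this sketch
}

// Hypothetical helper: true if the byte has the form 10bbbbbb.
function isContinuation(int $byte): bool {
    return ($byte & 0xC0) === 0x80;
}

$euro = "\xE2\x82\xAC";                 // U+20AC EURO SIGN, three bytes in UTF-8
$n = continuationCount(ord($euro[0]));  // lead byte 0xE2 matches 1110bbbb

// Same idea as the inner for loop: the next $n bytes must all be
// continuation bytes, otherwise the sequence is not well formed.
$valid = true;
for ($j = 1; $j <= $n; $j++) {
    if (!isset($euro[$j]) || !isContinuation(ord($euro[$j]))) {
        $valid = false;
        break;
    }
}
```

Read this way, the outer loop picks $n from the lead byte, and the inner loop steps over the next $n bytes, returning false if the string ends early or one of them is not of the form 10bbbbbb.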
