Reply to UTF-8 Headache --

Posted by James on 01/10/06 23:57

I have a function that (by fluke or whatever) used to work perfectly
and seems to have changed behaviour on me. The function was meant to
take a string and convert characters with diacritics to their
non-diacritic equivalents. For example, Dürer would become Durer
-- except all of a sudden it's becoming DA?rer. This is a problem :)
The function and some sample HTML are below -- any clues or hints would
be appreciated. I do see my extended character represented by the two
bytes -- I understand what has kinda happened, I just don't know how to
deal with it ...
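
A minimal sketch of what seems to be going on, assuming the form now submits UTF-8 (the sample page declares charset=utf-8). The variable names here are illustrative only. It dumps the byte values that a loop over ord(substr(...)) would see for "Dürer":

```php
<?php
// Assumption: the incoming data is UTF-8, so "ü" arrives as two bytes.
$word = "D\xC3\xBCrer"; // "Dürer" written out as explicit UTF-8 bytes

// Collect the numeric value of each byte, one position at a time.
$bytes = array();
for ($x = 0; $x < strlen($word); $x++) {
    $bytes[] = ord($word[$x]);
}

echo strlen($word), "\n";        // 6 (bytes), although there are only 5 characters
echo implode(' ', $bytes), "\n"; // 68 195 188 114 101 114
```

Byte 195 matches the `case 195: // Ã` branch and is replaced with "A", while byte 188 has no case and is left behind on its own, which would explain the "A?" in DA?rer.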

<?php

function kill_diacritic ($word_string) {

    global $dbtype;

    if (empty($word_string)) {
        return $word_string;
    }
    else {
        $string_length = strlen($word_string);

        for ($x=0;$x<$string_length;$x++) {
            $ascii = ord(substr($word_string,$x,1));

            switch($ascii){

                case 224: // à
                case 225: // á
                case 226: // â
                case 227: // ã
                case 228: // ä
                case 229: // å
                    $tmp = "a";
                    break;

                case 231: // ç
                    $tmp = "c";
                    break;

                case 232: // è
                case 233: // é
                case 234: // ê
                case 235: // ë
                    $tmp = "e";
                    break;

                case 236: // ì
                case 237: // í
                case 238: // î
                case 239: // ï
                    $tmp = "i";
                    break;

                case 241: // ñ
                    $tmp = "n";
                    break;

                case 240: // ð
                case 242: // ò
                case 243: // ó
                case 244: // ô
                case 245: // õ
                case 246: // ö
                case 248: // ø
                    $tmp = "o";
                    break;

                case 154: // š
                    $tmp = "s";
                    break;

                case 249: // ù
                case 251: // û
                case 252: // ü
                    $tmp = "u";
                    break;

                case 253: // ý
                    $tmp = "y";
                    break;

                case 158: // ž
                    $tmp = "z";
                    break;

                case 192: // À
                case 193: // Á
                case 194: // Â
                case 195: // Ã
                case 196: // Ä
                case 197: // Å
                    $tmp = "A";
                    break;

                /*
                // Oracle represents Æ as a ?. Not sure what MySQL will
                // Do with this character. Pretty sure nobody will ever
                // search using it but its there regardless.

                case 198: // Æ
                    $tmp = "?";
                    break;
                */

                case 200: // È
                case 201: // É
                case 202: // Ê
                case 203: // Ë
                    $tmp = "E";
                    break;

                case 208: // Ð
                    $tmp = "D";
                    break;

                case 204: // Ì
                case 205: // Í
                case 206: // Î
                case 207: // Ï
                    $tmp = "I";
                    break;

                case 209: // Ñ
                    $tmp = "N";
                    break;

                case 210: // Ò
                case 211: // Ó
                case 212: // Ô
                case 213: // Õ
                case 214: // Ö
                case 216: // Ø
                    $tmp = "O";
                    break;

                case 138: // Š
                    $tmp = "S";
                    break;

                case 217: // Ù
                case 218: // Ú
                case 219: // Û
                case 220: // Ü
                    $tmp = "U";
                    break;

                case 159: // Ÿ
                case 221: // Ý
                    $tmp = "Y";
                    break;
            } // switch

            if (!empty($tmp) or $tmp=="_") {
                $word_string = str_replace(chr($ascii),$tmp,$word_string);
                $tmp="";
            }
        } // for
    }

    return $word_string;
}
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">


<html>

<head>
<title>UTF8 Testing</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel='stylesheet' href='styles/default/stylesheet.css'
type='text/css'>
<script type="text/JavaScript" src="javascript.js"></script>

</head>

<body>

<form method="GET" action="index.php">
<input type="text" name="s" size="10" value="<?php echo
$_GET['s']; ?>">
<input type="submit" value="Search">
</form>

<p>
<?php echo $_GET['s']; ?>
</p>

<p>
<?php echo kill_diacritic($_GET['s']); ?>
</p>

</body>

</html>
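
One possible way to deal with it, as a sketch rather than a definitive fix: replace whole UTF-8 characters instead of single bytes. The function name kill_diacritic_utf8 is hypothetical, and the sketch assumes both the input and this source file are UTF-8. The map mirrors the characters handled by the switch above.

```php
<?php
// Hypothetical UTF-8-aware variant of kill_diacritic().
// Assumes $word_string is valid UTF-8 and this file is saved as UTF-8.
function kill_diacritic_utf8($word_string) {
    // Each key is a complete (multi-byte) character, so strtr() swaps
    // whole byte sequences instead of matching individual bytes.
    static $map = array(
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'a', 'å' => 'a',
        'ç' => 'c',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'ñ' => 'n',
        'ð' => 'o', 'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'o', 'ø' => 'o',
        'š' => 's',
        'ù' => 'u', 'û' => 'u', 'ü' => 'u',
        'ý' => 'y',
        'ž' => 'z',
        'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'A', 'Å' => 'A',
        'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E',
        'Ð' => 'D',
        'Ì' => 'I', 'Í' => 'I', 'Î' => 'I', 'Ï' => 'I',
        'Ñ' => 'N',
        'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O', 'Ö' => 'O', 'Ø' => 'O',
        'Š' => 'S',
        'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'U',
        'Ÿ' => 'Y', 'Ý' => 'Y',
    );

    // Characters without an entry in the map pass through unchanged.
    return strtr($word_string, $map);
}

echo kill_diacritic_utf8("Dürer"), "\n"; // Durer
```

Because the lookup is keyed on full characters, the two bytes of "ü" are never examined separately, so no stray byte is left behind.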
