Posted by Leo Andrews on 11/11/27 11:24
gene.ellis@gmail.com wrote:
> Put simply, I have a text box, and people commonly cut + paste
> information into this text box from Microsoft Word. The problem is that
> Word has all types of funky characters (smart quotes, em-dashes) that
> the system (php-based) doesn't understand. Does anyone know of a way to
> filter out these Microsoft-specific characters? Does PHP have a special
> function for this? Thanks a lot!
>
Hooray, I can actually be of use to this group for once. Yes, if you look
in the user notes on php.net for the htmlentities function, you will see
an entry from mail at britlinks dot com (19-May-2004 05:27). I've listed
it below for reference. Mind you, I'm sure the hardcore programmers on
this group will be able to formulate a one-line regexp for this, and we
look forward to seeing it.
In the meantime, I hope this helps.
<?php
// strips slashes, and converts special characters to HTML equivalents for string defined in $var
function htmlfriendly($var,$nl2br = false){
$chars = array(
128 => '&#8364;',
130 => '&#8218;',
131 => '&#402;',
132 => '&#8222;',
133 => '&#8230;',
134 => '&#8224;',
135 => '&#8225;',
136 => '&#710;',
137 => '&#8240;',
138 => '&#352;',
139 => '&#8249;',
140 => '&#338;',
142 => '&#381;',
145 => '&#8216;',
146 => '&#8217;',
147 => '&#8220;',
148 => '&#8221;',
149 => '&#8226;',
150 => '&#8211;',
151 => '&#8212;',
152 => '&#732;',
153 => '&#8482;',
154 => '&#353;',
155 => '&#8250;',
156 => '&#339;',
158 => '&#382;',
159 => '&#376;');
$var = str_replace(array_map('chr', array_keys($chars)), $chars,
htmlentities(stripslashes($var)));
if($nl2br){
return nl2br($var);
} else {
return $var;
}
}
?>
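If the goal is to filter the Word characters out rather than turn them into HTML entities, something along these lines might do. This is only a rough sketch, not part of the php.net note: the function names word_to_ascii and strip_cp1252 are made up for illustration, the ASCII stand-ins are one arbitrary choice, and it assumes the text arrives as single-byte Windows-1252 rather than UTF-8.

```php
<?php
// Hypothetical helper (not from the php.net note): swaps the most common
// Windows-1252 "Word" punctuation bytes for plain ASCII stand-ins.
// Assumption: $text is single-byte Windows-1252, not UTF-8.
function word_to_ascii($text) {
    $map = array(
        "\x85" => '...',  // horizontal ellipsis
        "\x91" => "'",    // left single quote
        "\x92" => "'",    // right single quote
        "\x93" => '"',    // left double quote
        "\x94" => '"',    // right double quote
        "\x95" => '*',    // bullet
        "\x96" => '-',    // en dash
        "\x97" => '--',   // em dash
    );
    // strtr with an array replaces every key with its value in one pass
    return strtr($text, $map);
}

// Hypothetical one-line regexp variant: simply drops every byte in the
// 0x80-0x9F range instead of substituting it.
function strip_cp1252($text) {
    return preg_replace('/[\x80-\x9F]/', '', $text);
}

echo word_to_ascii("\x93Hello\x94 \x96 it\x92s here\x85"), "\n";
// prints: "Hello" - it's here...
```

The first function keeps the text readable by substituting look-alikes; the second is blunter and just deletes the bytes, which is probably only acceptable if losing the punctuation does not matter.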