Posted by Jim Michaels on 02/17/06 03:23
"Jim Michaels" <jmichae3@nospam.yahoo.com> wrote in message
news:nZqdnTRe1_XSvmjeRVn-jQ@comcast.com...
>
> "Super Mango" <liebermann@gmail.com> wrote in message
> news:1137567117.302565.44500@g14g2000cwa.googlegroups.com...
>> Hi,
>>
>> I want to change a given string in a way that:
>> 1) all invisible chars (like ALT+0254) will be changed to spaces
>> 2) every sequence of several whitespace characters will be changed to a
>> single space.
>> 3) leading and trailing spaces will be removed (I know how to do that
>> with trim, so this is only nice-to-have)
>>
>> I can't limit the string to a-zA-Z0-9 because every visible char is
>> OK (it's a multilingual site).
>
> You have a problem. I don't see how a regex is going to work with Unicode
> or UTF-8 data. It can only handle chars up to 255 (\xff).
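Correction: it does handle them, e.g. \x{fe23}. As far as I know, PCRE only
accepts values above \xff in UTF-8 mode (the /u modifier in PHP), which also
requires the subject to be valid UTF-8. But now you have to know the code
point ranges where the character sets you care about are mapped.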
<?php print preg_match("/\x{ffff}/","\xff"); ?>
^Z
0
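
For the original three requirements, here is a minimal sketch rather than a
tested solution. It assumes the input is valid UTF-8 and that PCRE was built
with UTF-8 and Unicode property support. The function name normalize_spaces
is made up for illustration. Instead of listing code point ranges by hand, it
uses the Unicode property classes \p{C} (control, format, unassigned) and
\p{Z} (separators):

```php
<?php
// Hypothetical helper, not from the thread. Assumes $s is valid UTF-8 and
// that PCRE supports the /u modifier and \p{..} Unicode properties.
function normalize_spaces($s)
{
    // Replace every run of "other" characters (\p{C}), separators (\p{Z}),
    // and ASCII whitespace (\s) with one space. This covers steps 1 and 2.
    $s = preg_replace('/[\p{C}\p{Z}\s]+/u', ' ', $s);
    if ($s === null) {
        // preg_replace() returns null on error, e.g. invalid UTF-8 input.
        return false;
    }
    // Step 3: strip the single leading/trailing space that may remain.
    return trim($s, ' ');
}
```

Note that \p{C} also covers format characters such as the zero-width joiner,
which some scripts rely on. If that matters for a multilingual site, narrowing
the class to \p{Cc} (control characters only) may be the safer choice.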
>
>>
>> Any idea?
>>
>> Thanks in advance!
>>
>
>