Posted by Toby Inkster on 09/26/89 11:37
Super Mango wrote:
> 1) all invisible chars (like ALT+0254) will be changed to spaces
> 2) all sequence of several whitespaces will be changed to a single
> space.
> 3) leading and trailing spaces will be canceled (I know how to do that
> with trim, so this is only nice-to-have)
Unless you want to jump through hoops, you'll probably want not one, but
two regular expressions here. The first expression takes care of your
first two requests; the second does the trimming.
// Replace each run of white-space characters with a single space.
$out = preg_replace('/\s+/', ' ', $in);
// Trim leading and trailing space.
$out = preg_replace('/(^ )|( $)/', '', $out);
Now, "\s" in a preg_replace only matches the common whitespace characters
(space, formfeed, newline, carriage return, horizontal tab, and vertical
tab).
Unicode does have plenty of other space-like characters, so you may want
to explicitly add them to the first expression. PCRE needs the \x{...}
form plus the "u" (UTF-8) modifier for code points above 0xFF, like:
/[\s\x{00A0}\x{200B}\x{FEFF}]+/u
and so on. A good list of Unicode space-like characters can be found here:
http://www.cs.tut.fi/~jkorpela/chars/spaces.html
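Putting the pieces together, here is a minimal sketch of the whole clean-up as one function. It assumes the input is UTF-8 and PHP 7 or later; the name normalize_spaces() is made up for illustration, and the three extra code points are only examples that you would extend from the list above.

```php
<?php
// Hypothetical helper: the name normalize_spaces() is illustrative only.
function normalize_spaces(string $in): string
{
    // Collapse each run of whitespace, including a few Unicode space-like
    // characters (no-break space, zero-width space, BOM), into one space.
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
    $out = preg_replace('/[\s\x{00A0}\x{200B}\x{FEFF}]+/u', ' ', $in);

    // preg_replace() returns null on failure, e.g. malformed UTF-8 input.
    if ($out === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // After collapsing, at most one space remains at each end.
    return trim($out, ' ');
}
```

For example, normalize_spaces("\u{00A0} foo \t\n bar ") should give "foo bar". Using trim() for the last step is a swap for the second regular expression; both do the same job once the runs have been collapsed.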
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact