Posted by Simon Harris on 02/10/07 11:14
In this case, I want to remove them. I tried your suggestion, but it still
left & in the string in my test (using the function below).
function html2txt($document){
    $document = str_replace("<li>"," <li>",$document);
    $search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript
                    '@<[\\/\\!]*?[^<>]*?>@si',         // Strip out HTML tags
                    '#&\w+;#iU',                        // Strip out HTML entities
                    '@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
                    '@<![\\s\\S]*?--[ \\t\\n\\r]*>@'    // Strip multi-line comments including CDATA
    );
    $text = preg_replace($search, '', $document);
    return $text;
}
Just out of curiosity, how would you decode them? Thinking about it, this
might actually work better for me.
Regards,
Simon.
"Michael Fesser" <netizen@gmx.de> wrote in message
news:190qs2hinu8sup8f0hj86jtiu92g8cople@4ax.com...
> .oO(Simon Harris)
>
>>I am trying to replace HTML entities (Such as etc) with nothing.
>
> Do you really want to remove them or decode them?
>
>>Here's what I have so far, which isn't working: '@&[^;]@siU',
>
> '#&\w+;#iU'
>
> Micha
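
On the "remove or decode" question raised in the thread, here is a minimal sketch of both options, assuming PHP 7 or later and UTF-8 text. The helper names decode_entities() and strip_entities() are made up for illustration and do not come from the thread. html_entity_decode() and preg_replace() are standard PHP functions. The stripping pattern is a variation on the one quoted above that also covers numeric references such as &#160;, which \w+ alone does not match.

```php
<?php
// Hypothetical helpers for illustration; the names are not from the original thread.

// Decode entities into the characters they stand for.
function decode_entities(string $text): string
{
    // ENT_QUOTES decodes both quote styles; ENT_HTML5 selects the HTML5 entity table.
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Remove named, decimal, and hexadecimal entities entirely.
function strip_entities(string $text): string
{
    return preg_replace('/&(?:#[0-9]+|#x[0-9a-f]+|[a-z][a-z0-9]*);/i', '', $text);
}

echo decode_entities('Fish &amp; chips'), "\n"; // prints: Fish & chips
echo strip_entities('a&nbsp;b&#160;c'), "\n";   // prints: abc
```

Two things worth keeping in mind with this sketch: decoding &nbsp; yields a non-breaking space (U+00A0), not an ordinary space, and a bare & that is not followed by a name and a semicolon is left alone by both helpers.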