Posted by Stian Berger on 10/01/31 11:07
Hi!
strip_tags() would not solve his problem, although that was my first
thought as well.
Skipping tags, together with their content, when the content contains
certain words is possible.
But to me the problem is nested tags. What do you want to do when
you meet tables?
Here is an example that solves your example, and similar situations, but
not much else.
preg_match_all("/<(?!body|script|etc)(\w+)[^>]*>((?>(?!eee|etc|<\/\\1>).)*)<\/\\1>/s",$text,$match);
print_r($match[2]);
will return
[0] => aaa jjjj mmmm dddd yyyy ssss
[1] => aaa hhh mmmm dddd yyyy ssss
[2] => aaa kkkk mmmm dddd yyyy ssss
(?!body|script|etc) is used to filter unwanted tags, and in
(?!eee|etc|<\/\\1>) you can put your filter words.
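For anyone who wants to try this, here is a minimal, self-contained sketch that runs the same pattern against the sample from the original question. The build_pattern() helper and its parameter names are hypothetical (they are not part of the post above); it only assembles the two lookahead lists described here. It assumes PHP 7 or later and non-empty lists.

```php
<?php
// Sample input taken from the original question.
$text = <<<HTML
<body>
<p>aaa jjjj mmmm dddd yyyy ssss</p>
<b>aaa hhh mmmm dddd yyyy ssss</b>
<p>aaa eee mmmm dddd yyyy ssss</p>
<i>aaa kkkk mmmm dddd yyyy ssss</i>
</body>
HTML;

// Hypothetical helper: builds the pattern from a list of tag names to skip
// and a list of filter words. Both lists are assumed to be non-empty; an
// empty list would leave an empty alternative in the lookahead.
function build_pattern(array $skipTags, array $filterWords): string
{
    $quote = function (string $s): string {
        return preg_quote($s, '/');
    };
    $tags  = implode('|', array_map($quote, $skipTags));
    $words = implode('|', array_map($quote, $filterWords));

    // (?!tags)      rejects unwanted opening tags
    // (\w+)         captures the tag name for the \1 back-reference
    // (?>(?!...).)* consumes content one character at a time and stops at a
    //               filter word or at the matching closing tag
    return '/<(?!' . $tags . ')(\w+)[^>]*>((?>(?!' . $words . '|<\/\1>).)*)<\/\1>/s';
}

$pattern = build_pattern(['body', 'script'], ['eee']);

preg_match_all($pattern, $text, $match);
print_r($match[2]); // the three lines without "eee"

// Nested tags of the same name show the limitation mentioned above:
// the capture stops at the first closing tag, so markup leaks into it.
$nested = '<div><div>x</div></div>';
preg_match_all($pattern, $nested, $m);
print_r($m[2]); // a single entry: "<div>x"
```

The second call illustrates why this only works for flat markup like the sample: with nested elements the captured text is cut at the first matching closing tag.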
Hope this helps you anyway.
--
Stian
On Wed, 2 Feb 2005 11:36:26 +0100, Mirco Blitz <webmaster@lindworm.de> wrote:
> Hi,
> Use strip_tags() instead of regex.
>
> http://www.php-center.de/en-html-manual/function.strip-tags.html
>
> Greetings
> Mirco
>
> -----Original Message-----
> From: php [mailto:silviumaghear@programming-pool.com]
> Sent: Wednesday, 2 February 2005 09:25
> To: php-general@lists.php.net
> Subject: [PHP] regular expresion
>
> I want to parse an HTML file,
> for instance:
>
> <body>
> <p>aaa jjjj mmmm dddd yyyy ssss</p>
> <b>aaa hhh mmmm dddd yyyy ssss</b>
> <p>aaa eee mmmm dddd yyyy ssss</p>
> <i>aaa kkkk mmmm dddd yyyy ssss</i>
> </body>
>
> and I want to create a regular expression which is able to extract the
> entire text from enclosed tags WITHOUT a particular word,
> for example eee.
> In the end I want to obtain this result:
>
> aaa jjjj mmmm dddd yyyy ssss
> aaa hhh mmmm dddd yyyy ssss
> aaa kkkk mmmm dddd yyyy ssss
>
> Any solution?
>
>
> thank you
>
> Silviu
>