Posted by Stian Berger on 06/13/31 11:07
Hi! strip_tags() would not solve his problem, although that was my first thought as well.
 Skipping tags, together with their content, when the content contains certain words is possible.
 But to me the problem occurs with nested tags. What do you want to do when you meet tables?
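
To illustrate the nesting concern, here is a minimal sketch that runs the pattern quoted further down in this message against a small table. The sample markup and the variable names are made up for illustration. With no filter word present, the outermost element matches first, so its inner markup comes back raw instead of the text of each cell.

```php
<?php
// Same pattern as in this message, written as a single-quoted PHP string.
// "etc" is the placeholder from the original pattern.
$pattern = '/<(?!body|script|etc)(\w+)[^>]*>((?>(?!eee|etc|<\/\1>).)*)<\/\1>/s';

// Hypothetical sample input, not taken from the original question.
$nested = '<table><tr><td>aaa jjjj</td><td>aaa kkkk</td></tr></table>';

preg_match_all($pattern, $nested, $nestedMatch);
print_r($nestedMatch[2]);
// Prints a single element holding the raw inner markup of <table>:
// <tr><td>aaa jjjj</td><td>aaa kkkk</td></tr>
```

Whether that result is acceptable depends on what should happen to nested content, which is exactly the open question above.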
 
 Here is an example that solves your example, and similar situations, but not much else.
 
 preg_match_all("/<(?!body|script|etc)(\w+)[^>]*>((?>(?!eee|etc|<\/\\1>).)*)<\/\\1>/s",$text,$match);
 
 print_r($match[2]);
 
 will return
 [0] => aaa jjjj mmmm dddd yyyy ssss
 [1] => aaa hhh mmmm dddd yyyy ssss
 [2] => aaa kkkk mmmm dddd yyyy ssss
 
 (?!body|script|etc) is used to filter unwanted tags, and in (?!eee|etc|<\/\\1>) you can put your filter words.
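
As a rough sketch of how the two filters could be filled in from lists, the helper below builds the same pattern from an array of tag names and an array of words. The function name extract_text_without_words and its parameters are hypothetical and not part of the original example. It assumes flat, non-nested markup like the sample in the question.

```php
<?php
// Hypothetical helper: builds the pattern from two lists and returns the
// inner text of every element that passes both filters.
function extract_text_without_words($html, array $skipTags, array $skipWords)
{
    // preg_quote() escapes regex metacharacters, including the "/" delimiter.
    $quote = function ($s) { return preg_quote($s, '/'); };

    // Tag filter; left out when no tags are given, because an empty (?!) can never match.
    $tagFilter = $skipTags ? '(?!' . implode('|', array_map($quote, $skipTags)) . ')' : '';

    // Stop conditions: any filter word, or the closing tag of the current element.
    $stops = array_map($quote, $skipWords);
    $stops[] = '<\/\1>';

    $pattern = '/<' . $tagFilter . '(\w+)[^>]*>((?>(?!' . implode('|', $stops) . ').)*)<\/\1>/s';

    preg_match_all($pattern, $html, $match);
    return $match[2];
}

// Sample input taken from the original question.
$text = <<<HTML
<body>
<p>aaa jjjj mmmm dddd yyyy ssss</p>
<b>aaa hhh mmmm dddd yyyy ssss</b>
<p>aaa eee mmmm dddd yyyy ssss</p>
<i>aaa kkkk mmmm dddd yyyy ssss</i>
</body>
HTML;

$result = extract_text_without_words($text, array('body', 'script'), array('eee'));
print_r($result);
```

Note that the word filter matches substrings too, so "eee" would also exclude an element containing a longer word that includes those letters. Wrapping the words in \b would be one way to restrict it to whole words.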
 
 Hope this helps you anyway.
 
 --
 
 Stian
 
 On Wed, 2 Feb 2005 11:36:26 +0100, Mirco Blitz <webmaster@lindworm.de> wrote:
 
 > Hi,
 > Use strip_tags() instead of regex.
 >
 > http://www.php-center.de/en-html-manual/function.strip-tags.html
 >
 > Greetings
 > Mirco
 >
 > -----Original Message-----
 > From: php [mailto:silviumaghear@programming-pool.com]
 > Sent: Wednesday, 2 February 2005 09:25
 > To: php-general@lists.php.net
 > Subject: [PHP] regular expresion
 >
 > I want to parse a html file
 > for instance
 >
 > <body>
 > <p>aaa jjjj mmmm dddd yyyy ssss</p>
 > <b>aaa hhh mmmm dddd yyyy ssss</b>
 > <p>aaa eee mmmm dddd yyyy ssss</p>
 > <i>aaa kkkk mmmm dddd yyyy ssss</i>
 > </body>
 >
 > and I want to create a regular expression which is able to extract the entire text
 > from enclosed tags WITHOUT a particular word
 > for example: eee
 > In the end I want to obtain this result
 >
 > aaa jjjj mmmm dddd yyyy ssss
 > aaa hhh mmmm dddd yyyy ssss
 > aaa kkkk mmmm dddd yyyy ssss
 >
 > Any solution?
 >
 >
 > thank you
 >
 > Silviu
 >
 > --
 > PHP General Mailing List (http://www.php.net/) To unsubscribe, visit:
 > http://www.php.net/unsub.php