Reply to Re: PHP4 : Extract text from HTML file — PHP Programming Language

Posted by David Gillen on 07/06/06 10:50

A noise sounding like trihanhcie@gmail.com said:
> Hi,
>
> I would like to extract the text in an HTML file
> For the moment, I'm trying to get all text between <td> and </td>. I
> used a regular expression because I don't know the format between
> <td> and </td>.
>
> It can be :
><td> text1 </td>
> or
><td>
> text1
></td>
> or anything else
>
> eregi("<td(.*)>(.*)(</td>?)",$text,$regtext);
>
> The problem is that, if I have
><td> text</td>
><td>text2</td>
>
> regtext will return text</td><td>text2.
>
> How can I change the expression so that it stops at the first occurrence
> of </td>?
>
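That's a greedy regex: the .* matches as much as possible. To make it
non-greedy, use .*? instead. The ? in this position modifies the behaviour of
the .* so that it tries to match the smallest amount possible within the
constraints of the overall regex. Note that .*? is Perl-compatible (PCRE)
syntax, so you need the preg_* functions for it; the POSIX regexes used by
eregi() have no non-greedy quantifier.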
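
As a rough sketch of how that could look with the PCRE functions (the sample
input and the variable names $matches and $cells are made up for illustration,
and the pattern is only one way to write it, not a full HTML parser):

```php
<?php
// Hypothetical sample input: two cells, the second spread over several lines.
$text = "<td> text</td>\n<td>\ntext2\n</td>";

// (.*?) is the non-greedy group, so each match ends at the first </td>.
// Modifier i: case-insensitive, like eregi().
// Modifier s: let . match newlines, so multi-line cells are captured too.
// [^>]* skips any attributes without running past the end of the opening tag.
preg_match_all('#<td\b[^>]*>(.*?)</td>#is', $text, $matches);

// $matches[1] holds the first capture group of every match.
$cells = array_map('trim', $matches[1]);

print_r($cells);
```

With the sample input above, $cells ends up holding "text" and "text2" as
separate entries. preg_match_all() is used rather than preg_match() because
the aim is every cell, not just the first one.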

Regards,
D.
--

/(bb|[^b]{2})/
Trees with square roots don't have very natural logs.
What's the difference between ignorance and apathy? Who knows? Who cares?
