Posted by e.ahlback on 07/05/06 10:41
trihanhcie@gmail.com wrote:
> Hi,
>
> I would like to extract the text from an HTML file.
> For the moment, I'm trying to get all the text between <td> and </td>. I
> used a regular expression because I don't know the format of what is
> between <td> and </td>.
>
> It can be :
> <td> text1 </td>
> or
> <td>
> text1
> </td>
> or anything else
>
> eregi("<td(.*)>(.*)(</td>?)",$text,$regtext);
>
> The problem is that, if I have
> <td> text</td>
> <td>text2</td>
>
> $regtext will contain text</td><td>text2.
>
> How can I change the expression so that it stops at the first occurrence
> of </td>?
>
> Thanks
Hi.
I'm not sure, but I think this is what you want:
http://fi.php.net/manual/en/ref.dom.php
These functions should be able to extract the text from any tag.
Sorry if I'm wrong.
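
For what it's worth, here is a rough sketch of both routes, not a definitive answer. It assumes the DOM extension (DOMDocument) is available and that the cells sit inside a normal <table>. The function names extract_td_texts and extract_td_texts_regex are made up for this example. The second function is the regex route: it uses preg_match_all (PCRE) instead of eregi, because the non-greedy ".*?" that makes the match stop at the first </td> is a PCRE feature.

```php
<?php
// Hypothetical helper (name invented for this sketch):
// returns the trimmed text of every <td> found in $html, using the DOM extension.
function extract_td_texts(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is often not perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('td') as $cell) {
        // textContent is the text of the cell with any nested tags stripped.
        $texts[] = trim($cell->textContent);
    }
    return $texts;
}

// Hypothetical regex alternative (name invented for this sketch).
// "i" makes the match case-insensitive, "s" lets "." match newlines,
// and ".*?" is non-greedy, so each match ends at the first </td>.
function extract_td_texts_regex(string $html): array
{
    preg_match_all('#<td\b[^>]*>(.*?)</td>#is', $html, $matches);
    return array_map('trim', $matches[1]);
}

$html = "<table><tr><td> text</td>\n<td>\ntext2\n</td></tr></table>";
print_r(extract_td_texts($html));
print_r(extract_td_texts_regex($html));
```

The DOM version is probably the safer choice if the cells can contain other tags or nested tables. The regex version returns whatever markup is inside each cell as-is and does not handle nesting.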