Re: PHP4 : Extract text from HTML file


Posted by trihanhcie on 07/05/06 12:28

Hi

Thanks again for your help.

I'm trying a different method to extract all the text in an HTML file. If
you think it's a bad idea, tell me :)
I want to get all the text included between '>' and '<' in the body.
However, I think there's a mistake again in my regular expression...

preg_match_all('|>(.*)(\n)*(\r*)<|i',$text,$matches)

I want to match text like
<a href = ...> link </a>
<table>
<tr><td>
line1
line2
line3
</td></tr>

So I tried to add the end-of-line character, but it doesn't seem to
work :s Can anyone help?

Thanks
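
The pattern above likely fails on the table cell because '.' does not match a newline by default in PCRE, and '.*' is greedy within a line. A minimal sketch of one alternative, assuming the goal is every run of text between a '>' and the next '<': the negated class [^<]+ matches newlines as well, so no explicit \n or \r handling is needed. The sample markup and the names $chunks and $chunk are illustrative, not from the post.

```php
<?php
// Sample markup modeled on the example in the post (illustrative only).
$text = "<a href=\"#\"> link </a>\n<table>\n<tr><td>\nline1\nline2\nline3\n</td></tr>\n</table>";

// [^<]+ matches one or more characters other than '<', including newlines,
// so text spread over several lines is captured in a single group.
preg_match_all('|>([^<]+)<|', $text, $matches);

// Trim each captured chunk and drop the ones that were only whitespace
// (for example the line break between two adjacent tags).
$chunks = array();
foreach ($matches[1] as $chunk) {
    $chunk = trim($chunk);
    if ($chunk !== '') {
        $chunks[] = $chunk;
    }
}

print_r($chunks);
```

This is a sketch for simple markup only: a literal '<' or '>' inside an attribute value or a script block would confuse it.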

trihanhcie@gmail.com wrote:
> Thanks :) I'm a beginner with regular expressions and it is not so easy :D
>
> I'm still trying ^^
>
>
>
> Rik wrote:
> > trihanhcie@gmail.com wrote:
> > > It can be :
> > > <td> text1 </td>
> > > or
> > > <td>
> > > text1
> > > </td>
> > > or anything else
> > >
> > > eregi("<td(.*)>(.*)(</td>?)",$text,$regtext);
> > ---------------------------^
> > This doesn't do what you think it does
> >
> > > The problem is that, if I have
> > > <td> text</td>
> > > <td>text2</td>
> > >
> > > regtext will return text</td><td>text2.
> > >
> > > How can I change the expression so that it stops at the first
> > > occurrence of </td>?
> >
> > An asterisk (*) can be made non-greedy (i.e. capturing only until the rest
> > of the pattern can match) by placing a question mark after it.
> >
> > preg_match_all('|<td[^>]*>(.*?)</td>|i',$text,$matches);
> >
> > Grtz,
> > --
> > Rik Wasmus
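
To make the quoted advice concrete, here is a small sketch comparing the greedy and non-greedy forms on the two-cell input from the question. For context, the '?' in the eregi attempt applies only to the final '>' (making it optional), not to the preceding '(.*)'. The last part of the sketch is an assumption beyond the thread: it adds the s modifier so that '.' also matches newlines, which a cell spread over several lines needs. The variable names are illustrative.

```php
<?php
// Two cells on one line, as in the question.
$text = "<td> text</td><td>text2</td>";

// Greedy: .* takes as much as it can, so the match ends at the LAST </td>.
preg_match('|<td[^>]*>(.*)</td>|i', $text, $greedy);
// $greedy[1] is " text</td><td>text2"

// Non-greedy: .*? stops at the FIRST </td>, giving one match per cell.
preg_match_all('|<td[^>]*>(.*?)</td>|i', $text, $lazy);
// $lazy[1] is array(" text", "text2")

// Assumption beyond the thread: the s modifier lets . match newlines,
// so a cell whose content sits on its own lines is matched too.
$multiline = "<td>\ntext1\n</td>";
preg_match_all('|<td[^>]*>(.*?)</td>|is', $multiline, $cells);
$cellTexts = array_map('trim', $cells[1]);
// $cellTexts is array("text1")
```

Without the s modifier, the non-greedy pattern finds no match in the multi-line cell, because '.' stops at the first line break.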
