Re: Regex to get the <html></html>

Posted by FFMG on 08/02/07 17:39

Rik;84869 Wrote:
> On Thu, 02 Aug 2007 17:48:24 +0200, FFMG
> <FFMG.2up4zm@no-mx.httppoint.com>
> wrote:
> > I want to get the <head> code, and a 'simple?' solution seems to
> > be...
> >
> > preg_match_all("/<[html]+[^>]*>\s*(.*\s*)<\/html>\s*/i", $html,
> > $matches, PREG_SET_ORDER);
>
> Euhm, nope. You start on an undefined tag (lose the brackets around
> '[html]'), and you're matching the html tag, not the head tag.
>

Of course, thanks. Must have been a typo.
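
For reference, a minimal sketch of what a corrected pattern might look like. In the original, `[html]+` is a character class (one or more of the letters h, t, m, l), so it is not the literal tag name. The sample `$html` string and the variable names here are assumptions for illustration, and this still does nothing about a `</head>` hidden inside quotes:

```php
<?php
// Hypothetical sample input; not from the original post.
$html = '<html><head><title>Test</title></head><body><p>Hi</p></body></html>';

// Literal tag name instead of a character class, a lazy (.*?) so the match
// stops at the first </head>, and the /s flag so "." also spans newlines.
if (preg_match('/<head\b[^>]*>(.*?)<\/head\s*>/is', $html, $matches)) {
    $head = $matches[1];
} else {
    $head = null;
}

echo $head, "\n"; // prints: <title>Test</title>
```

`preg_match` is enough when only the first head block is wanted; `preg_match_all` with `PREG_SET_ORDER` would only matter if several matches were expected.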

Rik;84869 Wrote:
>
>
> DOM functions? <http://nl3.php.net/dom>
>
> > How can I change my regex to ignore head tags inside double or
> > single quotes?
>
> Could be done by setting a greedy match starting on a quote until the
> end quote. Then again, if you're concerned with invalid attributes,
> you'd have to allow for the possibility that the quotes are erroneous
> too, i.e. someone forgot to open or close them.
>
> I've taken a stab at it with regexes in the past, which works quite
> well as long as you can be sure it's strictly valid HTML. If it isn't,
> or you're using outside sources where this isn't known, don't use
> regular expressions for something a parser ought to be doing.
> --
> Rik Wasmus
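
A rough sketch of the DOM route Rik points to, using PHP's bundled DOM extension (`DOMDocument::loadHTML`, `getElementsByTagName`, `saveHTML`). The sample markup and variable names are assumptions for illustration; how the parser repairs badly broken input depends on the underlying libxml version, so treat this as a starting point only:

```php
<?php
// Hypothetical sample input with a misleading "</head>" inside an attribute value.
$html = '<html><head><title>Test</title><meta name="note" content="</head>"></head>'
      . '<body><p>Hi</p></body></html>';

$doc = new DOMDocument();
// Real-world markup is often invalid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Take the first <head> element and serialize each of its children back to HTML.
$head = $doc->getElementsByTagName('head')->item(0);
$inner = '';
if ($head !== null) {
    foreach ($head->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}

echo $inner, "\n";
```

Because the parser tokenizes attribute values itself, the quoted `</head>` is treated as data and does not end the element, which is the case the regex approach struggles with.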

Thanks. Are you suggesting that I walk the text, first looking for the
open tag and then for the close tag that is not within a quote?

I guess a simple function could do that.

Would you know of such a function, or would I need to write one? :)

FFMG
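
One way the "walk the text" idea above could be sketched. The function name `extractHeadContent` is hypothetical (it is not a PHP built-in and does not come from the thread). It assumes quotes only matter inside tags, i.e. in attribute values, and it makes no attempt to handle comments, `<script>`/`<style>` bodies, or CDATA:

```php
<?php
/**
 * Hypothetical helper: return the text between <head ...> and the first
 * </head> that is not inside a quoted attribute value, or null if no such
 * pair is found.
 */
function extractHeadContent(string $html): ?string
{
    // Locate the opening tag; \b keeps <header> from matching.
    if (!preg_match('/<head\b/i', $html, $m, PREG_OFFSET_CAPTURE)) {
        return null;
    }
    $len   = strlen($html);
    $i     = $m[0][1] + strlen($m[0][0]); // just past "<head"
    $inTag = true;                        // still inside the opening <head ...> tag
    $quote = null;                        // current quote character, if any
    $start = null;                        // offset where the head content begins

    for (; $i < $len; $i++) {
        $c = $html[$i];
        if ($inTag) {
            if ($quote !== null) {
                if ($c === $quote) {
                    $quote = null;        // closing quote of an attribute value
                }
            } elseif ($c === '"' || $c === "'") {
                $quote = $c;              // opening quote of an attribute value
            } elseif ($c === '>') {
                $inTag = false;           // end of the current tag
                if ($start === null) {
                    $start = $i + 1;      // content starts after the opening tag
                }
            }
        } elseif ($c === '<') {
            // A tag begins outside any quotes: is it the closing head tag?
            if (preg_match('/\G<\/head\s*>/i', $html, $mm, 0, $i)) {
                return substr($html, $start, $i - $start);
            }
            $inTag = true;
        }
    }
    return null;                          // unterminated tag, quote, or head
}

// Usage with a hypothetical input that hides "</head>" inside a quoted attribute.
$sample = '<html><head profile="x>y"><title>Bob\'s page</title>'
        . '<meta content="</head>"></head><body></body></html>';
echo extractHeadContent($sample), "\n";
// prints: <title>Bob's page</title><meta content="</head>">
```

If a quote is opened and never closed, the scan runs to the end of the input and returns null, which is the failure mode Rik warns about for invalid markup; a real parser is still the safer choice for untrusted pages.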


FFMG's Profile: http://www.httppoint.com/member.php?userid=580
View this thread: http://www.httppoint.com/showthread.php?t=19012
