Posted by Toby A Inkster on 08/03/07 10:20
FFMG wrote:
> I want to get the <head> code and a 'simple?' solution seems to
> be...
There is no simple solution. In HTML, the start and end tags for the
<head> element are *optional* -- in other words, the following valid
document is considered to have a head element containing one TITLE and
one META element:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>Foobar</title>
<meta name=Keywords content="Foo,Bar,Baz,Foobar">
<h1>Foobar</h1>
<p>Foo bar baz.</p>
Your regular expression will not find the <head> element, which *is*
there, even if you can't explicitly see the beginning and end!
Best to use PHP's DOM extension (DOMDocument), as Rik mentioned: a real
HTML parser works out the implied tags for you.
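A minimal sketch of the DOM approach, assuming PHP's bundled DOM extension (DOMDocument, which uses libxml2's HTML parser) is available. It feeds in the example document from above and compares a regex search with a DOM lookup. The variable names and the regex pattern are only illustrative; they are not taken from FFMG's code.

```php
<?php
// The example document: valid HTML 4.01 with no explicit <html>, <head> or <body> tags.
$html = <<<'HTML'
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>Foobar</title>
<meta name=Keywords content="Foo,Bar,Baz,Foobar">
<h1>Foobar</h1>
<p>Foo bar baz.</p>
HTML;

// A regex that looks for literal <head>...</head> tags (illustrative pattern).
$regexFound = preg_match('~<head\b[^>]*>(.*?)</head>~is', $html);

// Let the parser build the tree, including any implied elements.
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Look up the head element in the parsed tree and list its child elements.
$head = $doc->getElementsByTagName('head')->item(0);
$headChildren = [];
if ($head !== null) {
    foreach ($head->childNodes as $node) {
        if ($node instanceof DOMElement) {
            $headChildren[] = $node->nodeName;
        }
    }
}

echo 'Regex matched: ', $regexFound ? 'yes' : 'no', "\n";
echo 'DOM found head: ', $head !== null ? 'yes' : 'no', "\n";
echo 'Elements in head: ', implode(', ', $headChildren), "\n";
```

The regex has nothing to match because the tags are simply not in the source text, while the parser should still hand back a head element holding the title and meta elements. Exactly how implied tags get filled in depends on the libxml2 version behind your PHP build, so check the output on your own setup.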
--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 43 days, 13:49.]
Command Line Interfaces, Again
http://tobyinkster.co.uk/blog/2007/08/02/command-line-again/