Re: Extracting body from HTML document?

Posted by Andre-John Mas on 11/14/07 21:47

On Nov 14, 4:23 pm, Andre-John Mas <andrejohn....@gmail.com> wrote:
> Hi,
>
> I want to be able to get a section of an HTML document by
> specifying an XPath. For example:
>
> $title = GetSection('/html/head/title');
> $body = GetSection('/html/body');
>
> I made a simple parser myself some time back, but it is failing with
> certain types of documents. Instead of maintaining the code, I would
> rather find an existing solution, so that I can concentrate my
> development efforts elsewhere. Does anyone have anything they can
> recommend?
>
> Andre

My current implementation is very basic. The main issue I am having is
that if the start tag carries any attributes, nothing is returned,
because the code searches for the literal string "<name>". While I
could eventually fix this, I would rather use a robust API, since
there are certainly other issues I might run into.

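// Returns the text between the first $start tag and the next $end tag,
// or false if either is missing. $start is matched literally, so a start
// tag with attributes (e.g. <body class="x">) is not found.
function GetElementByName($xml, $start, $end) {
    $startpos = strpos($xml, $start);
    if ($startpos === false) {
        return false;
    }
    // The content begins right after the start tag.
    $contentpos = $startpos + strlen($start);
    // Search for the end tag only after the start tag.
    $endpos = strpos($xml, $end, $contentpos);
    if ($endpos === false) {
        return false;
    }

    return substr($xml, $contentpos, $endpos - $contentpos);
}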

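// Walks a slash-separated path of element names (not real XPath) and
// returns the content of the last element, or false if any step fails.
function XPathValue($XPath, $XML) {
    $node = $XML;
    // A leading "/" produces an empty first segment, so skip empty ones.
    // foreach replaces each(), which was removed in PHP 8.
    foreach (array_filter(explode("/", $XPath), 'strlen') as $name) {
        $node = GetElementByName($node, "<$name>", "</$name>");
        if ($node === false) {
            return false;
        }
    }

    return $node;
}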
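
For comparison, here is a minimal sketch of the same lookup done with
PHP's bundled DOM extension (DOMDocument and DOMXPath), which copes with
attributes on the start tag. It assumes the DOM extension is enabled and
that the input is a non-empty string. The GetSection name comes from the
example above, but the two-argument signature and the choice to return
the inner HTML of the first match are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the inner HTML of the first node matching
// $xpath, or false if nothing matches.
function GetSection(string $html, string $xpath)
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely well-formed; collect parser warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);
    if ($nodes === false || $nodes->length === 0) {
        return false;
    }

    // Serialize the children of the first match, i.e. its inner HTML.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

$html = '<html><head><title>Test page</title></head>'
      . '<body class="main"><p>Hello</p></body></html>';

$title = GetSection($html, '/html/head/title');
$body  = GetSection($html, '/html/body');
```

Because the document is parsed into a tree first, the class attribute on
the body tag does not get in the way, and the path is evaluated as real
XPath rather than as a list of tag names.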
