Posted by Andre-John Mas on 11/14/07 21:47
On Nov 14, 4:23 pm, Andre-John Mas <andrejohn....@gmail.com> wrote:
> Hi,
>
> I would like to extract a section of an HTML document by
> specifying an XPath. For example:
>
> $title= GetSection ( '/html/head/title');
> $body= GetSection ( '/html/body');
>
> I wrote a simple parser myself some time ago, but it fails with
> certain types of documents. Instead of maintaining that code, I would
> rather find an existing solution, so that I can concentrate my
> development efforts elsewhere. Does anyone have anything they can
> recommend?
>
> Andre
My current implementation is very basic. The main issue I am having is
that if the start element has any attributes, then nothing is returned,
because the code searches for the literal string "<name>". While I can
eventually solve this, I would rather use a robust API, since there are
certainly other issues I might run into.
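// Returns the text between the first occurrence of $start and the next
// occurrence of $end in $xml, or false if either marker is missing.
// $start is matched literally, so a tag with attributes is not found.
function GetElementByName($xml, $start, $end) {
    $startpos = strpos($xml, $start);
    if ($startpos === false) {
        return false;
    }
    // The content begins right after the start marker.
    $contentpos = $startpos + strlen($start);
    // Look for the end marker only after the start marker.
    $endpos = strpos($xml, $end, $contentpos);
    if ($endpos === false) {
        return false;
    }
    return substr($xml, $contentpos, $endpos - $contentpos);
}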
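// Walks a slash-separated path such as "/html/head/title", narrowing $XML
// to the content of each named element in turn. Returns false if a step fails.
function XPathValue($XPath, $XML) {
    $node = $XML;
    foreach (explode("/", $XPath) as $name) {
        if ($name === "") {
            continue; // skip the empty segment produced by a leading "/"
        }
        $node = GetElementByName($node, "<$name>", "</$name>");
        if ($node === false) {
            return false;
        }
    }
    return $node;
}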
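
For comparison, here is a minimal sketch of the same lookup done with PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of string searching. It is only one possible approach, not a recommendation taken from this thread. The helper name getSectionByXPath and the sample markup are made up for illustration, and passing a node to saveHTML() assumes PHP 5.3.6 or later. Because the document is parsed into a tree, a start tag with attributes should not be a problem.

```php
<?php
// Hypothetical helper (the name is an assumption): returns the inner markup
// of the first node matching $xpath in $html, or false when nothing matches.
function getSectionByXPath($html, $xpath)
{
    $doc = new DOMDocument();

    // Real-world HTML is often not well formed; collect parser warnings
    // quietly instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $finder = new DOMXPath($doc);
    $nodes = $finder->query($xpath);
    if ($nodes === false || $nodes->length === 0) {
        return false;
    }

    // Serialize the children of the matched node, i.e. its inner markup.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

// Made-up sample document; note the attribute on the body start tag.
$html = '<html><head><title>Test page</title></head>'
      . '<body class="main"><p>Hello</p></body></html>';

$title = getSectionByXPath($html, '/html/head/title');
$body  = getSectionByXPath($html, '/html/body');
```

The two calls at the end mirror the GetSection examples from the original question. A path that matches nothing, such as '/html/body/table' in this sample, yields false.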