 Posted by Andre-John Mas on 11/14/07 21:47 
On Nov 14, 4:23 pm, Andre-John Mas <andrejohn....@gmail.com> wrote: 
> Hi, 
> 
> I want to extract a section of an HTML document by specifying an 
> XPath. For example: 
> 
>     $title= GetSection ( '/html/head/title'); 
>     $body= GetSection ( '/html/body'); 
> 
> I wrote a simple parser myself some time back, but it fails with 
> certain types of documents. Instead of maintaining that code, I would 
> rather find an existing solution, so that I can concentrate my 
> development efforts elsewhere. Does anyone have anything they can 
> recommend? 
> 
> Andre 
 
My current implementation is very basic. The main issue I am having is 
that if there are any attributes associated with the start element, 
then nothing is returned. While I can eventually solve this, I would 
rather use a robust API, since there are certainly other issues I 
might run into. 
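
To illustrate the attribute problem, here is a minimal sketch. It assumes the lookup is an exact string match on the start tag, which is what the code below does; the sample markup is made up for the example.

```php
<?php
// Made-up sample document: the <body> start tag carries an attribute.
$html = '<html><body class="main"><p>Hi</p></body></html>';

// An exact match on "<body>" fails, because the tag in the document
// is actually '<body class="main">'.
$exact = strpos($html, '<body>');

// Matching only the tag prefix does find it, but then the rest of the
// start tag still has to be skipped by hand.
$prefix = strpos($html, '<body');

var_dump($exact);
var_dump($prefix);
```

Every such special case (attributes, self-closing tags, nested elements with the same name) would need its own handling, which is why I would prefer a real parser.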
 
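 function GetElementByName ($xml, $start, $end) { 
   // Locate the opening tag. The match is exact, so a tag carrying 
   // attributes (e.g. <body class="x">) is not found. 
   $startpos = strpos($xml, $start); 
   if ($startpos === false) { 
     return false; 
   } 
   // The content begins right after the opening tag. 
   $contentpos = $startpos + strlen($start); 
   // Search for the closing tag only after the opening tag. 
   $endpos = strpos($xml, $end, $contentpos); 
   if ($endpos === false) { 
     return false; 
   } 
   return substr($xml, $contentpos, $endpos - $contentpos); 
 } 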
 
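 function XPathValue($XPath,$XML) { 
   // Split the path into element names. 
   $XPathArray = explode("/",$XPath); 
 
   $node = $XML; 
   foreach ($XPathArray as $value) { 
     // A leading "/" produces an empty first segment; skip it. 
     if ($value === "") { 
       continue; 
     } 
     $node = GetElementByName($node, "<$value>", "</$value>"); 
     // Stop as soon as one step of the path is not found. 
     if ($node === false) { 
       return false; 
     } 
   } 
 
   return $node; 
 }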
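
For comparison, here is a rough sketch of what a parser-based version might look like, using PHP's bundled DOM extension (DOMDocument and DOMXPath). This is only an assumption about one possible approach, not a tested replacement: the name GetSection is taken from my original question, and the extra $html parameter and the sample markup are made up for the example.

```php
<?php
// Hypothetical helper: returns the inner HTML of the first node that
// matches $xpath, or false if nothing matches.
function GetSection($html, $xpath) {
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is often not well formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $finder = new DOMXPath($doc);
    $nodes = $finder->query($xpath);
    if ($nodes === false || $nodes->length === 0) {
        return false;
    }

    // Serialize the children of the first match, so the result is the
    // element's content without its own start and end tags.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

// Made-up sample document; note the attribute on <body>.
$html = '<html><head><title>Test page</title></head>'
      . '<body class="main"><p>Hello</p></body></html>';

$title = GetSection($html, '/html/head/title');
$body  = GetSection($html, '/html/body');

echo $title, "\n";
echo $body, "\n";
```

Because the document is actually parsed, attributes on the start tag should no longer matter, and the path is evaluated as a real XPath expression rather than by string search.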
 
  