Posted by Pavel Lepin on 01/04/08 13:03
Richard Price <news@directroute.co.uk> wrote in
<13ns7b2tg7ug6fa@corp.supernews.com>:
> I am updating an XML parser developed by a third party. The
> XML below is parsed by the system using the code that
> follows:
>
> $extract = parseXMLForLtd($xml, $type);
> function parseXMLForLtd($xml="", $type) {
>   preg_match_all("/\<section(.*?)\<\/section\>/s", $xml, $reportData);
>   $reportData[0] = preg_replace('/\n/i', '', $reportData[0]);
>   $result = $reportData[0];
>   for($i=0; $i<count($result); $i++) {
>     $item = trim(stripslashes($result[$i]));
>
>     if(eregi('<section id="name">', $item) &&
>        eregi('<section id="company identification">', $item)) {
>       $extract[name] =
>         preg_replace('/([<][\/a-zA-Z0-9 ="-]+[>])/i', '', $item);
OMG.
> However, when I try to parse the other field called
> "name" using:
>
> else if(eregi('<section id="name">', $item) &&
>         eregi('<section id="secretary">', $item)) {
>   $extract[secretary] =
>     preg_replace('/([<][\/a-zA-Z0-9 ="-]+[>])/i', '', $item);
>
> It does not work. Could anyone please advise how I can
> extract "Fred Bloggs"?
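Parsing hierarchical markup languages using regexen is an
exercise in futility, if not worse. A non-greedy pattern like
<section(.*?)</section> ends at the first closing tag it
meets, so it cannot keep track of nested sections. Use a real
XML parser instead, such as PHP's DOM extension: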
<http://www.php.net/manual/en/ref.dom.php>
An XPath expression fetching the node you need would be:
/Company/section[@id='officers']/
section[@id='secretary']/section[@id='name']/text()
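
As a rough sketch of how that expression might be used with the DOM extension: the XML from the original post is not quoted here, so the sample document below is hypothetical and only inferred from the XPath above. The function name extractSecretaryName is likewise made up for illustration, and the type hints assume a reasonably recent PHP (7.1 or later).

```php
<?php
// Hypothetical sample document: the real XML was not quoted in this
// thread, so this structure is only inferred from the XPath expression.
$xml = <<<XML
<Company>
  <section id="company identification">
    <section id="name">Example Ltd</section>
  </section>
  <section id="officers">
    <section id="secretary">
      <section id="name">Fred Bloggs</section>
    </section>
  </section>
</Company>
XML;

// Illustrative helper (not part of the original code): returns the
// secretary's name, or null if the XML is malformed or the node is absent.
function extractSecretaryName(string $xml): ?string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return null; // not well-formed XML
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        "/Company/section[@id='officers']"
        . "/section[@id='secretary']/section[@id='name']/text()"
    );

    if ($nodes === false || $nodes->length === 0) {
        return null; // invalid expression or no matching node
    }

    // The first matching text node holds the name.
    return trim($nodes->item(0)->nodeValue);
}

echo extractSecretaryName($xml), "\n";
```

With the sample document above this prints "Fred Bloggs". Because the path names each enclosing section explicitly, the two "name" sections can no longer be confused with one another.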
--
....also, I submit that we all must honourably commit seppuku
right now rather than serve the Dark Side by producing the
HTML 5 spec.