Posted by php on 10/18/01 11:28
My company recently installed Google's search appliance, and I am working
on some scripts to display the search results on our various websites.
The problem is that the XML parsing functions I've used on other pages
don't work here, because the returned XML has invalid XML characters in
it. For example, the data between the XML tags includes HTML tags that
are supposed to be displayed, but when I parse the XML, the parser sees
these tags as new start tags. Is there a way to work around this, or a
different way to parse this document? I've heard a little about XSLT but
really don't know anything about it, and I am wondering if that is the
way to deal with it.
Here is part of the XML returned by the Google appliance:
<GSP VER="3.2">
<TM>0.008398</TM>
<Q>information services</Q>
<PARAM name="q" value="information services"
original_value="information+services"/>
<PARAM name="site" value="shpolicy" original_value="shpolicy"/>
<PARAM name="client" value="shpolicy" original_value="shpolicy"/>
<PARAM name="output" value="xml_no_dtd" original_value="xml_no_dtd"/>
<PARAM name="btnG" value="Google_Search"
original_value="Google+Search"/>
<PARAM name="ip" value="10.2.4.44" original_value="10.2.4.44"/>
<PARAM name="access" value="p" original_value="p"/>
<RES SN="1" EN="10">
<M>86</M>
<FI/>
<NB>
<NU>
/search?q=information+services&site=shpolicy&hl=en&output=xml_no_dtd&client=shpolicy&access=p&sort=date:D:L:d1&start=10&sa=N
</NU>
</NB>
<R N="1" MIME="application/pdf">
<U>
http://shpolicy.shservices.org/administrative/InformationServices/housewideapplicable/Information%20Services%20Software%20Purchasing%20Policy.pdf
</U>
<UE>
http://shpolicy.shservices.org/administrative/InformationServices/housewideapplicable/Information%2520Services%2520Software%2520Purchasing%2520Policy.pdf
</UE>
<T>
<b>Information</b> <b>Services</b> Software Purchasing
</T>
<RK>5</RK>
<FS NAME="date" VALUE="2005-09-07"/>
<S>
<b>...</b> Administrative Housewide Policy <b>Information</b>
<b>Services</b> Software Purchasing Applicable<br> Campus: Salem and
West Valley Hospitals Department Name: <b>Information</b> <b>...</b>
</S>
<HAS>
<L/>
<C SZ="" CID="4_wracnOVC8:"/>
</HAS>
</R>
I can send the parsing code, but it's fairly straightforward and I
didn't want to needlessly fill up the email.
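For illustration only, here is a minimal sketch of the behaviour and one possible workaround. It is written in Python with the expat bindings and is not the parsing code mentioned above; the function name `extract_html_fields`, the `HTML_FIELDS` set, and the trimmed sample string are all assumptions made for the example. An event-based parser reports the embedded <b> elements as ordinary start tags, so the sketch tracks when it is inside a <T> or <S> element and re-emits any nested tags as text. It only helps when the input is well-formed: an unclosed tag such as the <br> in the <S> element would still make the parser raise an error, and attributes on nested tags are dropped.

```python
import xml.parsers.expat
from html import escape

# Trimmed, hypothetical sample modelled on the <T> element of the result.
SAMPLE = '<R N="1"><T><b>Information</b> <b>Services</b> Software Purchasing</T></R>'

# Elements whose content should be kept as displayable HTML (an assumption).
HTML_FIELDS = {"T", "S"}


def extract_html_fields(xml_text):
    """Return {field name: inner markup} for the elements in HTML_FIELDS."""
    results = {}
    state = {"field": None, "parts": []}

    def start(name, attrs):
        if state["field"] is None:
            if name in HTML_FIELDS:
                # Entering a field: start collecting its content.
                state["field"] = name
                state["parts"] = []
        else:
            # Nested tag such as <b>: re-emit it as text instead of
            # treating it as a new element of the result structure.
            state["parts"].append("<%s>" % name)

    def end(name):
        if state["field"] is None:
            return
        if name == state["field"]:
            # Leaving the field: store the reassembled markup.
            results[name] = "".join(state["parts"]).strip()
            state["field"] = None
        else:
            state["parts"].append("</%s>" % name)

    def chars(data):
        if state["field"] is not None:
            # Re-escape text so that characters like & stay valid in HTML.
            state["parts"].append(escape(data, quote=False))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return results


if __name__ == "__main__":
    print(extract_html_fields(SAMPLE)["T"])
```

The same idea (a flag set in the start handler and cleared in the end handler) should carry over to other event-based XML parsers, though the handler registration calls will differ.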
Any suggestions?
Thanks,
Robbert van Andel