Re: [PHP] URL restriction on XML file

Posted by Marek Kilimajer on 03/30/05 12:32

That's because the parser splits the character data at entity
boundaries, so for

http://feeds.example.com/?rid=318045f7e13e0b66&cat=48cba686fe041718&f=1

characterData() will be called 5 times:

http://feeds.example.com/?rid=318045f7e13e0b66
&
cat=48cba686fe041718
&
f=1
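
The splitting can be observed with a small test. This is only a sketch: the one-element document and the $chunks array are made up for illustration, and the exact number of pieces may differ depending on which XML library your PHP build uses. What does not change is that the pieces, joined in order, give the complete URL.

```php
<?php
// Hypothetical one-element document; the URL is the one from the feed.
$xml = '<link>http://feeds.example.com/?rid=318045f7e13e0b66&amp;cat=48cba686fe041718&amp;f=1</link>';

$chunks = array();   // every piece passed to the handler, in call order

$parser = xml_parser_create();
xml_set_character_data_handler($parser, function ($parser, $data) use (&$chunks) {
    $chunks[] = $data;   // record each piece instead of overwriting the last one
});
xml_parse($parser, $xml, true);

// Joining the pieces restores the full text, with &amp; already decoded to &.
$url = implode('', $chunks);

print count($chunks) . " call(s) to the handler\n";
print $url . "\n";
```

Keeping only the last piece, as a plain assignment in the handler does, leaves just the tail of the URL.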

The solution is inlined below, in the quoted script.

Roger Thomas wrote:
> I have a short script to parse my XML file. The parsing produces no errors and all output looks good, EXCEPT that URL links are truncated IF they contain the '&' character.
>
> My XML file looks like this:
> --- start of XML ---
> <?xml version="1.0" encoding="iso-8859-1"?>
> <rss version="2.0">
> <channel>
> <title>Test News .Net - Newspapers on the Net</title>
> <copyright>Small News Network.com</copyright>
> <link>http://www.example.com/</link>
> <description>Continuously updating Example News.</description>
> <language>en-us</language>
> <pubDate>Tue, 29 Mar 2005 18:01:01 -0600</pubDate>
> <lastBuildDate>Tue, 29 Mar 2005 18:01:01 -0600</lastBuildDate>
> <ttl>30</ttl>
> <item>
> <title>Group buys SunGard for US$10.4bil</title>
> <link>http://feeds.example.com/?rid=318045f7e13e0b66&amp;cat=48cba686fe041718&amp;f=1</link>
> <description>NEW YORK: A group of seven private equity investment firms agreed yesterday to buy financial technology company SunGard Data Systems Inc in a deal worth US$10.4bil plus debt, making it the biggest lev...</description>
> <source url="http://biz.theexample.com/">The Paper</source>
> </item>
> <item>
> <title>Strong quake hits Indonesia coast</title>
> <link>http://feeds.example.com/news/world/quake.html</link>
> <description>a &quot;widely destructive tsunami&quot; and the quake was felt as far away as Malaysia.</description>
> <source url="http://biz.theexample.com.net/">The Paper</source>
> </item>
> <item>
> <title>Final News</title>
> <link>http://feeds.example.com/?id=abcdef&amp;cat=somecat</link>
> <description>We are going to expect something new this weekend ...</description>
> <source url="http://biz.theexample.com/">The Paper</source>
> </item>
> </channel>
> </rss>
> --- end of XML ---
>
> For the sake of testing, my script only prints the URL links of the news items above. I got these:
> f=1
> http://feeds.example.com/news/world/quake.html
> cat=somecat
>
> The output for line 1 is truncated to 'f=1' and the output for line 3 is truncated to 'cat=somecat', i.e. the script only took the last parameter of the URL. The output for line 2 is correct since it has NO parameters.
>
> I am not sure what I have done wrong in my script. Is it because the RSS spec says that you cannot have parameters in a URL? Please advise.
>
> -- start of script --
> <?php
> $file = "test.xml";
> $currentTag = "";
>
> function startElement($parser, $name, $attrs) {
> global $currentTag;
> $currentTag = $name;
> }
>
> function endElement($parser, $name) {
> global $currentTag, $TITLE, $URL, $start;
>
> switch ($currentTag) {
> case "ITEM":
> $start = 0;
> case "LINK":
> if ($start == 1)
> #print "<A HREF = \"".$URL."\">$TITLE</A><BR>";
> print "$URL"."<BR>";
> break;
> }
> $currentTag = "";

// Also reset the other variables here, otherwise the appended data
// keeps growing from one element to the next:
$URL = '';
$TITLE = '';

> }
>
> function characterData($parser, $data) {
> global $currentTag, $TITLE, $URL, $start;
>
> switch ($currentTag) {
> case "ITEM":
> $start = 1;
> case "TITLE":
> $TITLE = $data;

// append instead of overwriting (this replaces the line above):
$TITLE .= $data;

> break;
> case "LINK":
> $URL = $data;

// append instead of overwriting (this replaces the line above):
$URL .= $data;

// Warning: entities are decoded at this point, you will receive &, not &amp;

> break;
> }
> }
>
> $xml_parser = xml_parser_create();
> xml_set_element_handler($xml_parser, "startElement", "endElement");
> xml_set_character_data_handler($xml_parser, "characterData");
>
> if (!($fp = fopen($file, "r"))) {
> die("Cannot locate XML data file: $file");
> }
>
> while ($data = fread($fp, 4096)) {
> if (!xml_parse($xml_parser, $data, feof($fp))) {
> die(sprintf("XML error: %s at line %d",
> xml_error_string(xml_get_error_code($xml_parser)),
> xml_get_current_line_number($xml_parser)));
> }
> }
>
> xml_parser_free($xml_parser);
>
> ?>
> -- end of script --
>
> TIA.
> Roger
>
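
For reference, here is one way the append-and-reset idea might look as a standalone script. It is a sketch and a variant, not the exact patch inlined above. The LinkCollector class, its property names, the shortened feed held in a string, and the choice to emit the link when </item> closes are all assumptions made for illustration. Because the handler receives a decoded &, the values are passed through htmlspecialchars() before being printed as HTML.

```php
<?php
// Shortened, hypothetical copy of the feed, kept in a string so the sketch runs on its own.
$xml = <<<'XML'
<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
<channel>
<link>http://www.example.com/</link>
<item>
<title>Group buys SunGard for US$10.4bil</title>
<link>http://feeds.example.com/?rid=318045f7e13e0b66&amp;cat=48cba686fe041718&amp;f=1</link>
</item>
<item>
<title>Final News</title>
<link>http://feeds.example.com/?id=abcdef&amp;cat=somecat</link>
</item>
</channel>
</rss>
XML;

// Hypothetical helper class; it holds the state the original script kept in globals.
class LinkCollector
{
    public $links = array();
    private $currentTag = '';
    private $inItem = false;
    private $title = '';
    private $url = '';

    public function startElement($parser, $name, $attrs)
    {
        $this->currentTag = $name;
        if ($name === 'ITEM') {
            // Start every item with empty accumulators.
            $this->inItem = true;
            $this->title = '';
            $this->url = '';
        }
    }

    public function characterData($parser, $data)
    {
        if (!$this->inItem) {
            return;   // skip the channel-level <title> and <link>
        }
        // Append: the text of one element may arrive in several calls.
        if ($this->currentTag === 'TITLE') {
            $this->title .= $data;
        } elseif ($this->currentTag === 'LINK') {
            $this->url .= $data;
        }
    }

    public function endElement($parser, $name)
    {
        if ($name === 'ITEM') {
            // Entities were decoded by the parser, so re-encode for HTML output.
            $this->links[] = '<A HREF="' . htmlspecialchars($this->url, ENT_QUOTES) . '">'
                . htmlspecialchars($this->title, ENT_QUOTES) . '</A>';
            $this->inItem = false;
        }
        $this->currentTag = '';
    }
}

$collector = new LinkCollector();
$parser = xml_parser_create();   // case folding is on by default, so names arrive in upper case
xml_set_element_handler($parser, array($collector, 'startElement'), array($collector, 'endElement'));
xml_set_character_data_handler($parser, array($collector, 'characterData'));

if (!xml_parse($parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

foreach ($collector->links as $link) {
    print $link . "<BR>\n";
}
```

Resetting the accumulators when <item> opens, rather than on every closing tag, keeps the title available until the whole item has been read.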

 
