Re: TCL/PHP/XML problem: I need to convert an XML file into a TCL list

Posted by Bryan Oakley on 11/25/06 19:04

comp.lang.tcl wrote:
> ... I do not understand XML parsing at all, DOM, SAX, xQuery,
> X[whatever else], none of it makes sense to me and I cannot understand
> the books, tutorials and online guides (not to mention the tips from
> far smarter people than I) that has barraged me; all of it makes
> absolutely no sense to me.
>
> At this point the only way I can parse XML is using PHP, because that's
> literally the ONLY way I can do it, period!
>
> But if you want to see a sample of the XML I'm working with, here it
> is:
>
> <?xml version="1.0" encoding="utf-8" ?><trivia><entry id="1101"
> triviaID="233" question="Who wrote &quot;Trilogy of Knowledge&quot;?"
> answerID="1" correctAnswerID="1" answer="Believer"
> expDate="1139634000"></entry><entry id="1102" triviaID="233"
> question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="2"
> correctAnswerID="1" answer="Saviour Machine"
> expDate="1139634000"></entry><entry id="1103" triviaID="233"
> question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="3"
> correctAnswerID="1" answer="Seventh Avenue"
> expDate="1139634000"></entry><entry id="1104" triviaID="233"
> question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="4"
> correctAnswerID="1" answer="Inevitable End"
> expDate="1139634000"></entry><entry id="1105" triviaID="233"
> question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="5"
> correctAnswerID="1" answer="No such song existed"
> expDate="1139634000"></entry>

That data is malformed XML. For example, the closing </trivia> tag is
missing.

Here's a solution that works with the above data. I've mentioned the
"xml2list" proc a couple of times, but judging from the sample, your
data will need a little extra pre-processing before the proc can use it.

Step 1: copy the proc "xml2list" from this page: http://mini.net/tcl/3919

Step 2: enter the following, which takes the above data verbatim and
stores it in a variable:

set data {<?xml version="1.0" encoding="utf-8" ?><trivia><entry
id="1101"
triviaID="233" question="Who wrote &quot;Trilogy of Knowledge&quot;?"
answerID="1" correctAnswerID="1" answer="Believer"
expDate="1139634000"></entry><entry id="1102" triviaID="233"
question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="2"
correctAnswerID="1" answer="Saviour Machine"
expDate="1139634000"></entry><entry id="1103" triviaID="233"
question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="3"
correctAnswerID="1" answer="Seventh Avenue"
expDate="1139634000"></entry><entry id="1104" triviaID="233"
question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="4"
correctAnswerID="1" answer="Inevitable End"
expDate="1139634000"></entry><entry id="1105" triviaID="233"
question="Who wrote &quot;Trilogy of Knowledge&quot;?" answerID="5"
correctAnswerID="1" answer="No such song existed"
expDate="1139634000"></entry>}

Your data is missing an ending </trivia> tag, so we have to add it for
this specific example. I don't know if this is a problem you'll have to
solve with your full dataset. Also, the xml2list proc doesn't like the
leading <?xml...> stuff. So, let's modify your data:

# remove the leading <?xml...?> declaration; "--" is just a throwaway
# variable name that receives the full match
regexp {<\?.*?\?>(.*$)} $data -- data

# add a trailing </trivia> which is missing from
# the sample data
set data "$data</trivia>"
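
If only some of your files turn out to be missing the closing tag, a
conditional version of the same fix might look like the sketch below. The
proc name "ensureClosingTag" is made up for this example, and it only looks
at the end of the string; it does not validate the XML in any way.

```tcl
# Hypothetical helper (not part of xml2list): append the closing root tag
# only when the data does not already end with it.
proc ensureClosingTag {data tag} {
    # Ignore trailing whitespace such as a final newline.
    set data [string trimright $data]
    set closing "</$tag>"
    # Compare the last N characters of the data with the closing tag.
    set tail [string range $data end-[expr {[string length $closing] - 1}] end]
    if {$tail ne $closing} {
        append data $closing
    }
    return $data
}
```

With that in place, the unconditional "set" above could be replaced by
"set data [ensureClosingTag $data trivia]".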

And now, convert it to a list and print it out:

set result [xml2list $data]
puts $result

If you didn't introduce any typos, you'll get the following output:

trivia {} {{entry {id 1101 triviaID 233 question {Who wrote
&quot;Trilogy of Knowledge&quot;?} answerID 1 correctAnswerID 1 answer
Believer expDate 1139634000} {}} {entry {id 1102 triviaID 233 question
{Who wrote &quot;Trilogy of Knowledge&quot;?} answerID 2 correctAnswerID
1 answer {Saviour Machine} expDate 1139634000} {}} {entry {id 1103
triviaID 233 question {Who wrote &quot;Trilogy of Knowledge&quot;?}
answerID 3 correctAnswerID 1 answer {Seventh Avenue} expDate 1139634000}
{}} {entry {id 1104 triviaID 233 question {Who wrote &quot;Trilogy of
Knowledge&quot;?} answerID 4 correctAnswerID 1 answer {Inevitable End}
expDate 1139634000} {}} {entry {id 1105 triviaID 233 question {Who wrote
&quot;Trilogy of Knowledge&quot;?} answerID 5 correctAnswerID 1 answer
{No such song existed} expDate 1139634000} {}}}

The above is a valid Tcl list that you can now process with normal Tcl
list-handling commands. Do *not* process this list with string
transformations (such as converting &quot; to a quote). If you do, you
run the risk of breaking its list-ness. Instead, loop over the data and
do the conversion as a final step on an element-by-element basis.
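
As a rough sketch of that element-by-element approach: the code below walks
the list and decodes entities in one attribute value at a time. It assumes
Tcl 8.5 or later (for "lassign"), the proc name "decodeEntities" is made up
for this example, and "result" is set to a shortened, hand-copied version of
the output above so the snippet runs without xml2list.

```tcl
# Shortened copy of the list shown above (two entries only).
set result {trivia {} {
    {entry {id 1101 triviaID 233
            question {Who wrote &quot;Trilogy of Knowledge&quot;?}
            answerID 1 correctAnswerID 1 answer Believer
            expDate 1139634000} {}}
    {entry {id 1102 triviaID 233
            question {Who wrote &quot;Trilogy of Knowledge&quot;?}
            answerID 2 correctAnswerID 1 answer {Saviour Machine}
            expDate 1139634000} {}}
}}

# Hypothetical helper: decode the five predefined XML entities in ONE value.
# string map makes a single pass and never rescans replaced text, so
# "&amp;quot;" becomes "&quot;" and is not decoded a second time.
proc decodeEntities {s} {
    set map [list "&quot;" "\"" "&apos;" "'" "&lt;" "<" "&gt;" ">" "&amp;" "&"]
    return [string map $map $s]
}

# The top level is a three-element list: tag, attributes, children.
lassign $result rootTag rootAttrs children

set decoded {}
foreach child $children {
    lassign $child childTag childAttrs
    # Attributes are a flat name/value list; load them into an array.
    array unset entry
    array set entry $childAttrs
    # Decode individual values, then rebuild a list with list commands.
    lappend decoded [list $entry(answerID) \
        [decodeEntities $entry(question)] \
        [decodeEntities $entry(answer)]]
}

foreach row $decoded {
    puts $row
}
```

Because each row is built with "list" and "lappend", the decoded quote
characters are quoted properly and the result stays a valid list.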

Does this help? It's not robust: the xml2list proc assumes well-formed
XML with a balanced set of tags, which is why the missing </trivia> tag
had to be added by hand in this message. Hopefully, though, it will at
least get you started.
