Posted by Steve Ball on 11/26/06 21:04
> I've said this over and over again: it's not that I won't take the
> time, it's that I CANNOT do it because I have a learning disability
> called Attention Deficit Disorder which psychologically does not permit
> me to take the time to learn it.
I'm sorry to hear that you have problems, but is that an excuse not to
RTFM?
I'll try to help out with a simple solution. Note that your input XML
and the result that you want are, how do I put it, weird. This
solution will not suit many other applications (iow, don't try this at
home, kids).
Your source XML looks like this:
> <?xml version="1.0" encoding="utf-8" ?><trivia><entry id="1101"
> triviaID="233" question="Who wrote &quot;Trilogy of Knowledge&quot;?"
> answerID="1" correctAnswerID="1" answer="Believer"
> expDate="1139634000"></entry>
....
So, a bunch of <entry> elements where the real data is in attributes.
This looks like a dump of a relational database to me.
> Into this:
>
> id 1101 triviaID 233 question {Who wrote "Trilogy of
> Knowledge"?} answerID 1 correctAnswerID 1 answer Believer expDate
> 113963500
Yup, and then you're recreating the relational data in Tcl.
NB. XML is not the problem here - you may as well have just used CSV
rather than XML.
A SAX-style interface is probably the easiest way to get what you want.
Here's the solution:
package require xml

namespace eval xml2list {
    namespace export convert
    variable accumulator
}

proc xml2list::convert xml {
    variable accumulator
    # we only need to know about the start of elements,
    # so create a parser and set the start-element callback
    set parser [xml::parser -elementstartcommand [namespace code Start]]
    set accumulator {}
    # let 'er rip
    $parser parse $xml ;# not handling errors
    # the result will be in the accumulator
    return $accumulator
}

# This procedure gets called for every start tag
proc xml2list::Start {tag attlist args} {
    variable accumulator
    # the parser has already done the work of turning the
    # attributes into a list; skip elements that carry no
    # attributes (such as the <trivia> root) so they do not
    # add empty entries to the result
    if {[llength $attlist] > 0} {
        lappend accumulator $attlist
    }
    return {}
}
### end of script
Notes:
1. This solution is single-threaded. It is an exercise for the reader
to have it run multi-threaded.
2. Representing the attributes as Tcl lists is probably not ideal - a
dict would be better. However, in this case lists are necessary to
represent the relational data.
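On note 2, here is a minimal sketch of what dict access could look like. It assumes Tcl 8.5 or later, where the dict command is built in. A flat name/value list such as the attribute list for one <entry> is already a valid dict, so no conversion is needed. The variable names (entry, question, isCorrect) are made up for illustration, and the sample value is copied from the XML quoted above.

```tcl
# Hypothetical sample: the attribute list for one <entry>, as it would
# appear as one element of the list returned by xml2list::convert
set entry {id 1101 triviaID 233 question {Who wrote "Trilogy of Knowledge"?} answerID 1 correctAnswerID 1 answer Believer expDate 1139634000}

# A flat name/value list can be read directly with dict get
set question [dict get $entry question]

# Compare two fields of the same record
set isCorrect [expr {[dict get $entry answerID] == [dict get $entry correctAnswerID]}]

puts "$question -> [dict get $entry answer] (correct: $isCorrect)"
```

Looking up fields by name this way avoids depending on the order in which the attributes appear in the source XML.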
HTHs,
Steve Ball