Context: I'm trying to read an XML file and return the data as a dictionary in which every XML element is a key and the element's content is the corresponding value. Whenever I find an element nested inside another, I want the value to be another dictionary, so I've written a function I can call recursively.
I use an xml.sax.ContentHandler to parse the document. Simplified, I have one list of all elements, represented as numbers, and one list of the elements that have a value, also as numbers. My thinking is that every item in the elements list that doesn't appear in the values list must be an element that contains other elements.
The parsing works, example lists:
Code:
elements = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]
values = [2,3,4,6,7,8,10,11,12,13,14,15,16,17]
element 0: root element of the document,
element 1: first element, contains 3 elements (2,3,4),
element 2: second element, has a value,
etc.
The code simplified:
Code:
def main():
    elements = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]
    values = [2,3,4,6,7,8,10,11,12,13,14,15,16,17]
    fillDict(elements, values)

def fillDict(elements, values):
    for indexer in elements:
        if indexer in values:
            print '-IN- indexer =', indexer, 'elements =', elements
        else:
            print '-NOT IN- indexer =', indexer, 'elements =', elements
            fillDict(elements[indexer + 1:], values)
            return
    return

main()
Argh! It's the unholy alliance of iteration and recursion! But really, the problem you're experiencing is that indexer holds the values in the list, not the indices. Hence, when the list suddenly jumps from 1 to 3, it's because you were moving by (1 [value from indexer] + 1) = 2.
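To make the value-versus-index point concrete, here is a small sketch (Python 3 syntax) that contrasts the two ways of slicing. The function names visit_by_value and visit_by_position are made up for this illustration, and both functions collect the visited elements in a list instead of printing them. This only shows the slicing difference; it is not a full solution to building the dictionary.

```python
def visit_by_value(elements, values):
    """Mimic the original loop: slice with the element's VALUE (the bug)."""
    visited = []
    for indexer in elements:
        visited.append(indexer)
        if indexer not in values:
            # indexer is a list value, so this offset is wrong as soon as
            # the slice no longer starts at 0
            visited.extend(visit_by_value(elements[indexer + 1:], values))
            return visited
    return visited


def visit_by_position(elements, values):
    """Same loop, but slice with the element's POSITION via enumerate()."""
    visited = []
    for position, element in enumerate(elements):
        visited.append(element)
        if element not in values:
            # position is relative to the current slice, so nothing is skipped
            visited.extend(visit_by_position(elements[position + 1:], values))
            return visited
    return visited


elements = list(range(18))
values = [2, 3, 4, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17]
skipping = visit_by_value(elements, values)
complete = visit_by_position(elements, values)
```

With the lists from the question, skipping misses element 2 (the jump from 1 to 3 described above), while complete visits every element once.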
Judging by your code, I think what you want can be much more easily accomplished with a list comprehension:
Code:
[x for x in elements if x in values]
This returns a list of elements that are values. If that's not what you're looking for, please elaborate on your problem a bit more and I'll try to help.
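Applied to the lists from the question, the comprehension and its complement could look like the sketch below. The names value_set, leaves and containers are only illustrative; the set is an optional convenience for faster membership tests, not something the original code requires.

```python
elements = list(range(18))
values = [2, 3, 4, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17]

# A set makes each "in" test cheap; a plain list works as well
value_set = set(values)

# Elements that have a value
leaves = [x for x in elements if x in value_set]

# Elements without a value, i.e. the ones assumed to contain other elements
containers = [x for x in elements if x not in value_set]
```

Here containers comes out as [0, 1, 5, 9], which matches the description of element 0 as the root and element 1 as the first nested element.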
My usual way of thinking about such things is heavily influenced by Lisp. Fortunately, so was Python's designer. A Python list is much more than an array!
To my way of thinking, an XML data structure is most properly represented by a list containing one three-tuple: (element_name, attributes_list, elements_list). This tuple represents the root node of the XML structure.
Both the second and the third items are, themselves, lists. The elements_list is a list of zero or more three-tuples of the format previously described. The attributes_list is a list of zero or more two-tuples of the form (attribute_name, attribute_value), where attribute_value cannot be a list but must be a simple value.
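As a rough illustration of that layout, here is a hand-written example. The element and attribute names (library, book, title, id) are invented for the sketch, as is the helper count_elements.

```python
# A list containing one three-tuple: (element_name, attributes_list, elements_list)
root = [
    ("library", [("name", "central")], [
        ("book", [("id", "1")], [
            ("title", [], []),
        ]),
        ("book", [("id", "2")], []),
    ]),
]


def count_elements(elements_list):
    """Count every element in the tree by walking the nested elements_lists."""
    return sum(1 + count_elements(children)
               for _name, _attrs, children in elements_list)


total = count_elements(root)

# attributes_list is a list of two-tuples, so dict() gives simple lookups
name, attrs, children = root[0]
root_attrs = dict(attrs)
```

Because every level has the same shape, one short recursive function is enough to walk the whole structure.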
If your purpose is to build a DOM-like data structure, an important part of your processing might involve a "scaffolding list," a push-down stack that contains references to "the nested set of things that you are presently building." As SAX notifies you that you are entering and leaving the nested structures, the topmost scaffolding-list entry tells you where you are. The scaffolding is completely consumed by the time the processing ends.
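One possible sketch of the scaffolding idea with xml.sax (Python 3) follows. The class name TreeBuilder and the function parse_to_tuples are hypothetical, and character data is deliberately ignored because the three-tuple layout described above has no slot for text.

```python
import xml.sax


class TreeBuilder(xml.sax.ContentHandler):
    """Build (element_name, attributes_list, elements_list) tuples from SAX events."""

    def __init__(self):
        super().__init__()
        self.root_list = []    # ends up holding the single root three-tuple
        self.scaffolding = []  # stack of elements_lists currently being filled

    def startElement(self, name, attrs):
        children = []
        node = (name, list(attrs.items()), children)
        # The topmost scaffolding entry is the list this node belongs in
        parent = self.scaffolding[-1] if self.scaffolding else self.root_list
        parent.append(node)
        self.scaffolding.append(children)

    def endElement(self, name):
        # Leaving the element: its child list is complete
        self.scaffolding.pop()


def parse_to_tuples(xml_bytes):
    handler = TreeBuilder()
    xml.sax.parseString(xml_bytes, handler)
    # The scaffolding should be fully consumed once parsing has finished
    assert handler.scaffolding == []
    return handler.root_list


tree = parse_to_tuples(b'<a x="1"><b/><c><d/></c></a>')
```

The tuples themselves are immutable, but the lists inside them are not, which is what lets children be appended after the parent tuple has already been created.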
If you need to provide an index to the DOM structure, additional data structures can be built alongside the DOM to serve that purpose.
Thank you for your answers.
@ taylor_venable: I'm sure it was "the unholy alliance of iteration and recursion" (LOL) that plagued me, but as a programmer (in this case a private one) I like to live on the edge and haven't found that many risky things in Python yet...
@ sundialsvcs: your comment made me reassess what I wanted to do, but I determined that another representation of the data wasn't useful to me. I tried to code a generic class that accepted a dict and returned a dict, with nothing more advanced than the original data types as attributes. You were right in saying that I was in fact building a DOM structure within the dict, so I didn't need to stick to SAX as a processing mechanism.
Since I already spent more time on this than anticipated, I took the easy way: I searched for code on the net and found an example which uses ElementTree. That was educational for me, solved my problem and provided an elegant and extensible way of dealing with XML. With my 'generic' class finished and put in a module, I don't need to bother with XML again.
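The example found on the net isn't reproduced in this thread, so the following is only a guess at the general shape of an ElementTree-based approach, not that code. The function names are made up, attributes are ignored, and repeated sibling tags would overwrite each other in this minimal version.

```python
import xml.etree.ElementTree as ET


def element_to_value(element):
    """Return the text of a leaf element, or a dict for an element with children."""
    children = list(element)
    if not children:
        return element.text
    # Assumption: sibling tags are unique; duplicates would overwrite each other
    return {child.tag: element_to_value(child) for child in children}


def xml_to_dict(xml_text):
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_value(root)}


result = xml_to_dict("<root><a><b>1</b><c>2</c></a><d>3</d></root>")
```

Since ElementTree already hands over the nested structure, the recursion follows the tree directly and needs no index bookkeeping.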