12-23-2006, 10:01 AM   #1
Caesar Tjalbo
[Python] Logical problem working with lists

Context: I'm trying to read an XML file and return the data as a dictionary in which every XML element is a key and the element's content is the corresponding value. Whenever I find an element nested inside another, I want the value to be another dictionary, so I've written a function that I can call recursively.

I use an xml.sax.ContentHandler to parse the document. Simplified, I have one list that identifies the elements by number and another list that holds the numbers of the elements that have a value. My reasoning is that every item in the elements list that doesn't have a value must be an element that contains other elements.

The parsing works. Example lists:
    elements = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]
    values = [2,3,4,6,7,8,10,11,12,13,14,15,16,17]
element 0: root element of the document,
element 1: first element, contains 3 elements (2,3,4),
element 2: second element, has a value,

The code simplified:
def main():
    elements = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]
    values = [2,3,4,6,7,8,10,11,12,13,14,15,16,17]
    fillDict(elements, values)

def fillDict(elements, values):
    for indexer in elements:
        if indexer in values:
            print '-IN- indexer =', indexer, 'elements =', elements
        else:
            print '-NOT IN- indexer =', indexer, 'elements =', elements
            fillDict(elements[indexer + 1:], values)

The output:
-NOT IN- indexer = 0 elements = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
-NOT IN- indexer = 1 elements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
-IN- indexer = 3 elements = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
-IN- indexer = 4 elements = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
-NOT IN- indexer = 5 elements = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
-NOT IN- indexer = 9 elements = [9, 10, 11, 12, 13, 14, 15, 16, 17]
It skips elements 2, 6, 7, etc. I really don't see why 'indexer' jumps from 1 to 3 and from 5 to 9. Does anybody see what I'm missing here?
12-23-2006, 10:42 AM   #2
Argh! It's the unholy alliance of iteration and recursion! But really, the problem you're experiencing is that indexer holds the values in the list, not the indices. Hence, when the list suddenly jumps from 1 to 3, it's because the slice moved you forward by (1 [value from indexer] + 1) = 2 positions.
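
To make the skipping concrete, here is a small sketch of what the slice does at the point where indexer is 1. It uses Python 3 syntax (the code in the thread uses Python 2 print statements), and the names skipping_tail and correct_tail are illustrative, not taken from the original code.

```python
# The list that the first recursive call is looping over.
elements = list(range(1, 18))            # [1, 2, ..., 17]

# The loop variable holds a VALUE, but the slice treats it as a POSITION.
indexer = elements[0]                    # value 1
skipping_tail = elements[indexer + 1:]   # drops positions 0 and 1, i.e. values 1 and 2

# enumerate() supplies the real position of each item alongside its value.
position, value = next(enumerate(elements))   # (0, 1)
correct_tail = elements[position + 1:]        # drops only the current item

print(skipping_tail[0])   # 3
print(correct_tail[0])    # 2
```

The same thing happens later in the output: in the list that starts at 3, the value 5 is used as a position, and position 6 of that list holds 9.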

Judging by your code, I think what you want can be much more easily accomplished with a list comprehension:
[x for x in elements if x in values]
This returns a list of elements that are values. If that's not what you're looking for, please elaborate on your problem a bit more and I'll try to help.
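
As a runnable sketch of that comprehension on the lists from the first post (Python 3 syntax), the block below also computes the complement, which corresponds to the elements the original poster treats as containers. The names value_set, leaves and containers are assumptions made for this example.

```python
elements = list(range(18))
values = [2, 3, 4, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17]

# A set makes each membership test cheap instead of rescanning the list.
value_set = set(values)

# Elements that have a value.
leaves = [x for x in elements if x in value_set]

# Elements without a value, i.e. the ones assumed to contain other elements.
containers = [x for x in elements if x not in value_set]

print(leaves == values)   # True
print(containers)         # [0, 1, 5, 9]
```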
12-23-2006, 07:53 PM   #3
My usual way of thinking about such things is heavily influenced by Lisp. Fortunately, so was Python's designer. A Python list is much more than an array!

To my way of thinking, an XML data structure is most properly represented by a list containing one three-tuple: (element_name, attributes_list, elements_list). This tuple represents the root node of the XML structure.

Both the second and the third items are, themselves, lists. The elements_list is a list of zero or more three-tuples of the format previously described. The attributes_list is a list of zero or more two-tuples of the form (attribute_name, attribute_value), where attribute_value cannot be a list but must be a simple value.
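
A minimal sketch of that layout, written out by hand for a made-up document (the library/shelf/book names and attribute values are purely hypothetical). Text content is left out because the scheme as described only covers element names, attributes and child elements. The helper element_names is also an assumption, added to show how the nesting can be walked recursively.

```python
# Hypothetical document being represented:
# <library city="Ghent">
#   <shelf id="a"><book title="Dune"/><book title="Emma"/></shelf>
#   <shelf id="b"/>
# </library>
tree = [
    ("library", [("city", "Ghent")], [
        ("shelf", [("id", "a")], [
            ("book", [("title", "Dune")], []),
            ("book", [("title", "Emma")], []),
        ]),
        ("shelf", [("id", "b")], []),
    ]),
]

def element_names(nodes, depth=0):
    """Walk the nested tuples depth-first and return (depth, name) pairs."""
    found = []
    for name, _attributes, children in nodes:
        found.append((depth, name))
        found.extend(element_names(children, depth + 1))
    return found

# Print the element names indented by nesting depth.
for depth, name in element_names(tree):
    print("  " * depth + name)
```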

If your purpose is to build a DOM-like data structure, an important part of your processing might involve a "scaffolding list": a push-down stack that contains references to "the nested set of things that you are presently building." As SAX notifies you that you are entering and leaving the nested structures, the topmost scaffolding-list entry tells you where you are. The scaffolding is completely consumed by the time the processing ends.
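
The scaffolding idea can be sketched with the standard xml.sax module as follows (Python 3 syntax). This is one possible reading of the description above, not the poster's own code: the class name TupleTreeBuilder and the function parse_to_tuples are hypothetical, and character data is deliberately ignored to keep the sketch short.

```python
import xml.sax

class TupleTreeBuilder(xml.sax.ContentHandler):
    """Builds nested (name, attributes_list, elements_list) tuples from SAX events."""

    def __init__(self):
        super().__init__()
        self.root = []          # will end up holding the single root tuple
        self.scaffolding = []   # stack of the elements currently being built

    def startElement(self, name, attrs):
        node = (name, list(attrs.items()), [])
        if self.scaffolding:
            # The topmost entry is the parent; attach to its elements_list.
            self.scaffolding[-1][2].append(node)
        else:
            self.root.append(node)
        self.scaffolding.append(node)   # entering the new element

    def endElement(self, name):
        self.scaffolding.pop()          # leaving it again

def parse_to_tuples(xml_bytes):
    """Parse XML bytes and return (tree, leftover_scaffolding)."""
    handler = TupleTreeBuilder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root, handler.scaffolding

tree, leftover = parse_to_tuples(b'<a x="1"><b/><c y="2"><d/></c></a>')
print(tree)
print(leftover)   # [] because the scaffolding is consumed once parsing ends
```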

If you need to provide an index to the DOM structure, additional data structures can be built alongside the DOM to serve that purpose.
12-27-2006, 03:58 AM   #4
Caesar Tjalbo
Thank you for your answers.
@ taylor_venable: I'm sure it was "the unholy alliance of iteration and recursion" (LOL) that plagued me, but as a (i.c. private) programmer I like to live on the edge and haven't found that many risky things in Python yet...
@ sundialsvcs: your comment made me reassess what I wanted to do, but I determined that another representation of the data wasn't useful to me. I tried to code a generic class that accepted a dict and returned a dict, with nothing more advanced than the original data types as attributes. You were right in saying that I was in fact building a DOM structure within the dict, so I didn't need to stick to SAX as a processing mechanism.

Since I already spent more time on this than anticipated, I took the easy way: I searched for code on the net and found an example which uses ElementTree. That was educational for me, solved my problem and provided an elegant and extensible way of dealing with XML. With my 'generic' class finished and put in a module, I don't need to bother with XML again.
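
The example the poster found is not shown in the thread. As a rough sketch of how an XML-to-nested-dict conversion with ElementTree could look (Python 3 syntax), assuming leaf elements carry text and ignoring attributes, something like the following would do. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def element_to_value(element):
    """Return the element's text if it has no children, else a dict of its children."""
    children = list(element)
    if not children:
        return element.text
    # Caveat: repeated child tags overwrite each other in a plain dict.
    return {child.tag: element_to_value(child) for child in children}

def xml_to_dict(xml_text):
    """Parse an XML string into a nested dict keyed by element names."""
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_value(root)}

# Made-up sample data for illustration only.
sample = (
    "<settings>"
    "<user><name>caesar</name><shell>bash</shell></user>"
    "<port>8080</port>"
    "</settings>"
)
print(xml_to_dict(sample))
# {'settings': {'user': {'name': 'caesar', 'shell': 'bash'}, 'port': '8080'}}
```

All values come back as strings (or None for an empty element), so any type conversion would have to happen in a later step.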

