
Nesting Dictionary from Word XML

I have a Word file with forms in it. The goal is to scan through the XML (with lxml) and produce a dictionary of {formTag: formValue}. It gets a little more complicated because forms can be nested inside other repeating forms, which initially produces

{topLevelFormTag:formTag1+formValue1+formTag2+formValue2,  formTag1:formValue1, formTag2:formValue2}

However, the goal is to end up with

{topLevelFormTag:{formTag1:formValue1, formTag2:formValue2}}

As I search through the file (for field in xmlroot.iter(TAG_FIELD):) I fill out two dictionaries, parents and descendants, with parents[field] = field.getparent() and descendants[field] = list(field.iterdescendants()). Below is my method for collapsing the dictionary of all fields into a nested dictionary. If there is only one level of nesting it works fine, but it fails with additional levels. It fails because a nested form is in the descendants of every level above it, so it could be placed as a child of any of the upper levels.

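for ptag in parents:
    for dtag in descendants:
        # ptag is nested somewhere below dtag if its parent is one of dtag's descendants
        if parents[ptag] in descendants[dtag]:
            print("{} is a descendant of {}".format(ptag, dtag))
            try:
                # dtag already holds a dict: move ptag's value under it
                fields[dtag][ptag] = fields[ptag]
                del fields[ptag]
            except TypeError:
                # dtag still holds a plain value: replace it with a dict
                fields[dtag] = {ptag: fields[ptag]}
                del fields[ptag]
            except KeyError:
                # ptag was already moved under another ancestor
                print("!!!{}:{}!!!".format(ptag, dtag))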
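
To make the ambiguity concrete, here is a minimal sketch on a toy document. It uses the standard library's xml.etree.ElementTree instead of lxml so that it runs on its own, and the element name "field" and the "name" attribute are hypothetical stand-ins for the real Word XML. It shows that the innermost field turns up in the descendants of both enclosing fields:

```python
import xml.etree.ElementTree as ET

# Hypothetical structure: three "field" elements nested inside one another.
SAMPLE = (
    "<doc><field name='top'><field name='mid'>"
    "<field name='leaf'>v</field>"
    "</field></field></doc>"
)
root = ET.fromstring(SAMPLE)

# Same idea as the descendants dictionary above, keyed by field name.
# Element.iter() also yields the element itself, so it is filtered out.
descendants = {
    f.get("name"): [d.get("name") for d in f.iter("field") if d is not f]
    for f in root.iter("field")
}
print(descendants)
# {'top': ['mid', 'leaf'], 'mid': ['leaf'], 'leaf': []}

# Every enclosing level lists 'leaf', so a membership test alone
# cannot tell which of them is the immediate container.
containers_of_leaf = [name for name, ds in descendants.items() if "leaf" in ds]
print(containers_of_leaf)
# ['top', 'mid']
```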

How can I determine the bottommost level at which to place a field so that my dictionary is nested correctly?

To find the innermost nested dictionary, you can use recursion:

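def last_nest(somedict):
    # Follow the first nested dict found at each level.
    for key in somedict:
        if isinstance(somedict[key], dict):
            return last_nest(somedict[key])
    # No nested dict at this level: this is the innermost one.
    return somedict

test = {"a": {"b": {"c": 123}}}
print(last_nest(test))  # {'c': 123}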

So the main thing you need to think about is how to terminate the recursion so that you end up with the dict you want.
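
One way to apply this to the original problem is to build the nested dictionary during a recursive walk of the tree, rather than collapsing a flat dictionary afterwards. The sketch below is an illustration under assumptions, not the asker's code: it uses the standard library's xml.etree.ElementTree so that it runs on its own, the element name "field" and the "tag" attribute are hypothetical, and a field's value is taken to be its text, which is simpler than real Word XML. The recursion terminates at a field that contains no other fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: fields may sit inside other fields,
# sometimes with non-field wrapper elements in between.
SAMPLE = """
<doc>
  <field tag="top">
    <p>
      <field tag="mid">
        <field tag="leaf1">one</field>
        <field tag="leaf2">two</field>
      </field>
    </p>
    <field tag="side">three</field>
  </field>
  <field tag="solo">four</field>
</doc>
"""

def nest_fields(element, out=None):
    """Collect fields below element, each under its nearest enclosing field."""
    if out is None:
        out = {}
    for child in element:
        if child.tag == "field":
            # Recurse into the field with a fresh dict for its own children.
            inner = nest_fields(child, {})
            # Termination: a field with no nested fields keeps its text value.
            out[child.get("tag")] = inner if inner else (child.text or "").strip()
        else:
            # A non-field wrapper does not add a nesting level; keep filling out.
            nest_fields(child, out)
    return out

root = ET.fromstring(SAMPLE)
nested = nest_fields(root)
print(nested)
# {'top': {'mid': {'leaf1': 'one', 'leaf2': 'two'}, 'side': 'three'}, 'solo': 'four'}
```

Because each field is only ever written into the dict of the call that is currently descending through its closest enclosing field, it cannot end up attached to a higher level. lxml elements can also be iterated over their children, so the same walk should carry over with the real field tag and value extraction substituted in.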
