Nesting Dictionary from Word XML
I have a Word file with forms in it. The goal is to scan through the XML (with lxml) and produce a dictionary of {formTag:formValue}. It gets a little more complicated because forms can be nested in other repeating forms, which initially produces
{topLevelFormTag:formTag1+formValue1+formTag2+formValue2, formTag1:formValue1, formTag2:formValue2}
However, the goal is to end up with
{topLevelFormTag:{formTag1:formValue1, formTag2:formValue2}}
As I search through the file (for field in xmlroot.iter(TAG_FIELD):) I fill out two dictionaries, parents and descendants, with parents[field] = field.getparent() and descendants[field] = list(field.iterdescendants()). Below is my method for collapsing the dictionary of all fields into a nested dictionary. If there is only one level of nesting it works fine, but it fails with additional levels. It fails because a nested form is among the descendants of every level above it, so it could be placed as a child of any of the upper levels.
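for ptag in parents:
    for dtag in descendants:
        # ptag nests under dtag if ptag's parent element is one of dtag's descendants
        if parents[ptag] in descendants[dtag]:
            print("{} is a descendant of {}".format(ptag, dtag))
            try:
                fields[dtag][ptag] = fields[ptag]
                del fields[ptag]
            except TypeError:
                # fields[dtag] was still a plain value, so start a nested dict
                fields[dtag] = {ptag: fields[ptag]}
                del fields[ptag]
            except KeyError:
                # ptag was already moved under another ancestor
                print("!!!{}:{}!!!".format(ptag, dtag))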
How can I determine the bottom-most level in which to place a field, so that my dictionary is nested correctly?
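One way to read "bottom-most level" is "the nearest enclosing field": walk up from each field and stop at the first ancestor that is itself a field. The following is only a sketch under assumptions. The `<field>` element name, the `tag` attribute, and the sample document are hypothetical stand-ins for the real Word XML, and the standard library's `xml.etree.ElementTree` is used so the snippet runs without lxml. With lxml, `field.iterancestors(TAG_FIELD)` should give the same chain, nearest ancestor first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the Word XML: every form is a
# <field> element and its name lives in a "tag" attribute.
SAMPLE = """
<doc>
  <field tag="top">
    <wrapper>
      <field tag="mid">
        <field tag="leaf">value</field>
      </field>
    </wrapper>
  </field>
</doc>
"""


def nearest_field_ancestor(element, parent_of, field_tag="field"):
    """Return the closest enclosing element named field_tag, or None."""
    node = parent_of.get(element)
    while node is not None and node.tag != field_tag:
        node = parent_of.get(node)
    return node


root = ET.fromstring(SAMPLE)
# ElementTree has no getparent(), so build a child -> parent lookup first.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Map each field name to the name of its nearest enclosing field.
owner = {}
for field in root.iter("field"):
    ancestor = nearest_field_ancestor(field, parent_of)
    owner[field.get("tag")] = ancestor.get("tag") if ancestor is not None else None

print(owner)
```

Here "leaf" is assigned to "mid" only, even though it is also a descendant of "top", which removes the ambiguity described above.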
To find the last nest in any dictionary, you have to use recursion:
def last_nest(somedict):
    # Follow the first dict-valued entry down until no nested dict remains
    for i in somedict:
        if isinstance(somedict[i], dict):
            return last_nest(somedict[i])
    return somedict

test = {"a": {"b": {"c": 123}}}
print(last_nest(test))  # {'c': 123}
So the main thing you need to think about is how to terminate the recursion in order to get the dict that you want at the end.
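The same recursive idea can also be applied to the XML tree directly, so that the nested dictionary is built in one pass instead of being collapsed afterwards. This is a minimal sketch, not the asker's method. It assumes a hypothetical layout in which each form is a `<field>` element whose name is in a `tag` attribute and whose value is its text. The function `nest_fields` and the sample document are illustrative only. The recursion terminates at fields that contain no further fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the Word XML.
SAMPLE = """
<doc>
  <field tag="top">
    <wrapper>
      <field tag="mid">
        <field tag="leaf1">v1</field>
        <field tag="leaf2">v2</field>
      </field>
    </wrapper>
    <field tag="other">v3</field>
  </field>
  <field tag="solo">v4</field>
</doc>
"""


def nest_fields(element, field_tag="field", name_attr="tag"):
    """Build {name: value-or-nested-dict} for the fields below element."""
    result = {}
    for child in element:
        if child.tag == field_tag:
            inner = nest_fields(child, field_tag, name_attr)
            # A field with nested fields becomes a dict; otherwise keep its text.
            result[child.get(name_attr)] = inner if inner else (child.text or "").strip()
        else:
            # A non-field wrapper adds no level: its fields belong to this one.
            result.update(nest_fields(child, field_tag, name_attr))
    return result


nested = nest_fields(ET.fromstring(SAMPLE))
print(nested)
```

Because each field is only ever visited from its closest enclosing field, it cannot end up attached to a higher level. The function only iterates over child elements and reads `.tag`, `.get()` and `.text`, so it should work on an lxml tree as well, although comments or processing instructions in the tree may need to be skipped there.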