Returning value after recursively iterating through XML
I'm working with a deeply nested XML file, and the path to each value is critical for understanding it. This answer lets me print both the path and the value: Python xml absolute path
What I can't figure out is how to output the result in a more usable way (I'm trying to construct a DataFrame with Path and Value columns).
For example, from the linked answer:
<A>
    <B>foo</B>
    <C>
        <D>On</D>
    </C>
    <E>Auto</E>
    <F>
        <G>
            <H>shoo</H>
            <I>Off</I>
        </G>
    </F>
</A>
from lxml import etree

root = etree.XML(your_xml_string)

def print_path_of_elems(elem, elem_path=""):
    for child in elem:
        if not child.getchildren() and child.text:
            # leaf node with text => print
            print("%s/%s, %s" % (elem_path, child.tag, child.text))
        else:
            # node with child elements => recurse
            print_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

# start from "/A" so the printed paths are absolute
print_path_of_elems(root, "/" + root.tag)
This results in the following printout:
/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off
I believe yield is the correct technique, but I'm getting nowhere. My current attempt returns nothing:
from lxml import etree

root = etree.XML(your_xml_string)

def yield_path_of_elems(elem, elem_path=""):
    for child in elem:
        if not child.getchildren() and child.text:
            ylddict = {'Path':elem_path, 'Value':child.text}
            yield(ylddict)
        else:
            # node with child elements => recurse
            yield_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

for i in yield_path_of_elems(root):
    # print for simplicity in example, otherwise turn into DF and concat
    print(i)
From experimenting, I believe that when I use yield or return, the recursion doesn't function correctly.
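You need to pass the values yielded by the recursive call back to the original caller. Calling a generator function only creates a generator object; unless something iterates over it, its body never runs and its values are discarded. So change: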
yield_path_of_elems(child, "%s/%s" % (elem_path, child.tag))
to
yield from yield_path_of_elems(child, "%s/%s" % (elem_path, child.tag))
This is analogous to the way you have to use return recursive_call(...) in a normal recursive function.
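For reference, here is a minimal end-to-end sketch of the fix applied to the sample XML from the question. It makes a few assumptions that are not in the original post: it uses the standard library's xml.etree.ElementTree instead of lxml so it runs without third-party packages (the same generator should work on lxml elements), the function name iter_paths is made up for illustration, it tests for leaves with len(child) == 0 rather than getchildren(), and it stores the full path including the leaf's own tag, which is what the printed output in the question shows.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
XML = """<A>
  <B>foo</B>
  <C><D>On</D></C>
  <E>Auto</E>
  <F><G><H>shoo</H><I>Off</I></G></F>
</A>"""


def iter_paths(elem, elem_path=""):
    """Yield one {'Path': ..., 'Value': ...} dict per leaf element with text.

    iter_paths is a hypothetical name used only for this sketch.
    """
    for child in elem:
        child_path = "%s/%s" % (elem_path, child.tag)
        if len(child) == 0 and child.text:
            # leaf node with text => hand the row to the caller
            yield {"Path": child_path, "Value": child.text}
        else:
            # node with child elements => re-yield everything the
            # recursive generator produces
            yield from iter_paths(child, child_path)


root = ET.fromstring(XML)
rows = list(iter_paths(root, "/" + root.tag))
for row in rows:
    print(row)
```

Because rows is a plain list of dicts, it can be passed directly to pandas.DataFrame(rows) to get the Path and Value columns, with no need to concatenate per-row frames. Note that yield from requires Python 3.3 or later; on older versions the equivalent is a loop, for item in iter_paths(child, child_path): yield item.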