Returning value after recursively iterating through XML

I'm working with a deeply nested XML file, and the path to each value is critical for understanding it. This answer enables me to print both the path and the value: Python xml absolute path

What I can't figure out is how to output the result in a more usable form. I'm trying to construct a dataframe listing Path and Value.

For example, from the linked answer:

<A>
  <B>foo</B>
  <C>
    <D>On</D>
  </C>
  <E>Auto</E>
  <F>
    <G>
      <H>shoo</H>
      <I>Off</I>
    </G>
  </F>
</A>

from lxml import etree
root = etree.XML(your_xml_string)

def print_path_of_elems(elem, elem_path=""):
    for child in elem:
        if not child.getchildren() and child.text:
            # leaf node with text => print
            print("%s/%s, %s" % (elem_path, child.tag, child.text))
        else:
            # node with child elements => recurse
            print_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

# start from "/A" so the printed paths carry the leading slash shown in the output
print_path_of_elems(root, "/" + root.tag)

This results in the following printout:

/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off

I believe yield is the correct technique, but I'm getting nowhere. My current attempt returns nothing:

from lxml import etree
root = etree.XML(your_xml_string)

def yield_path_of_elems(elem, elem_path=""):
    for child in elem:
        if not child.getchildren() and child.text:
            ylddict = {'Path':elem_path, 'Value':child.text}
            yield(ylddict)
        else:
            # node with child elements => recurse
            yield_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

for i in yield_path_of_elems(root):
    #print for simplicity in example, otherwise turn into DF and concat
    print(i)

From experimenting, I believe that when I use yield or return, the recursion doesn't work correctly.

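You need to pass the values yielded by the recursive call back to the original caller. Calling a generator function only creates a generator object; nothing runs until something iterates over it, so the bare recursive call is silently discarded. So change: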

yield_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

to

yield from yield_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

This is analogous to the way you have to use return recursive_call(...) in a normal recursive function.
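Putting the fix together, a minimal self-contained sketch might look like the following. It makes a few assumptions that are not in the question: it uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra installs, it checks for leaves with len(child) == 0 rather than getchildren(), it stores the full path including the leaf's own tag, and it starts the path at "/" + root.tag. The names XML_STRING and rows are only illustrative.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (XML_STRING is an illustrative name).
XML_STRING = """<A>
  <B>foo</B>
  <C>
    <D>On</D>
  </C>
  <E>Auto</E>
  <F>
    <G>
      <H>shoo</H>
      <I>Off</I>
    </G>
  </F>
</A>"""


def yield_path_of_elems(elem, elem_path=""):
    """Yield a {'Path', 'Value'} dict for every leaf element that has text."""
    for child in elem:
        child_path = "%s/%s" % (elem_path, child.tag)
        if len(child) == 0 and child.text:
            # leaf node with text => hand the row to the caller
            yield {"Path": child_path, "Value": child.text}
        else:
            # node with child elements => re-yield everything the
            # recursive generator produces
            yield from yield_path_of_elems(child, child_path)


root = ET.fromstring(XML_STRING)
# Assumption: seed the path with "/A" so every path is absolute.
rows = list(yield_path_of_elems(root, "/" + root.tag))
for row in rows:
    print(row)
```

Because rows is a plain list of dicts with the same keys, it can be passed directly to pandas.DataFrame(rows) to get the Path and Value columns, without concatenating frames inside the loop.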
