
Access elements and attributes directly using lxml etree

Given the following XML structure:

<root>
    <a>
        <from name="abc">
            <b>xxx</b>
            <c>yyy</c>
        </from>
        <to name="def">
            <b>blah blah</b>
            <c>another blah blah</c>
        </to>
    </a>
</root>

How can I directly access the value of "from.b" for each "a" without first loading "from" (with find()) for each "a"?

As you can see, "from" and "to" contain exactly the same child elements, so findall() on its own would not work: I have to distinguish which parent each "b" value comes from.

I would like a way to access these values directly, because loading each child element one by one (there are a lot of them) would make my code quite verbose. Performance also matters in my case, since I have a lot of XML documents to parse, so I need the fastest way to go through each document (and store the data in a DB).

Within each "a" element there is exactly one "from" element, and within each "from" element there is exactly one "b" element.

I have no problem doing this with lxml objectify, but I want to use etree: I already have to parse the XML document with etree in order to validate it against an XSD schema, and I do not want to parse the whole document a second time.

find (and findall) also accept a path to elements, so you can do, for example:

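from lxml import etree as ET

root = ET.fromstring(input_xml)  # input_xml: the XML document as a string

for a in root.findall('a'):
    # 'from/b' is resolved relative to each <a>, so the <b> under <to> is never matched
    print(a, a.find('from/b').text)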

This assumes there is always exactly one from element and one b element. If from/b is missing, find returns None and the .text access raises an AttributeError.
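As a self-contained illustration, here is a minimal sketch that runs the same path lookups against the sample document from the question. The names SAMPLE and rows are made up for this example, and findtext is used as a shortcut for find(...).text. The import falls back to the standard library's xml.etree.ElementTree, which supports the same path syntax, in case lxml is not installed.

```python
try:
    from lxml import etree as ET
except ImportError:  # assumption: fall back to the stdlib, same find/findall API
    import xml.etree.ElementTree as ET

# The sample document from the question.
SAMPLE = """<root>
    <a>
        <from name="abc">
            <b>xxx</b>
            <c>yyy</c>
        </from>
        <to name="def">
            <b>blah blah</b>
            <c>another blah blah</c>
        </to>
    </a>
</root>"""

root = ET.fromstring(SAMPLE)

rows = []  # hypothetical container, e.g. rows to insert into a DB later
for a in root.findall('a'):
    rows.append({
        # Attribute access: find the element, then read the attribute with get()
        'from_name': a.find('from').get('name'),
        # Text access: the path keeps from/b and to/b apart
        'from_b': a.findtext('from/b'),
        'to_b': a.findtext('to/b'),
    })

print(rows)
```

Because each path starts at the current "a" element, the "b" under "from" and the "b" under "to" never get mixed up, and no intermediate variable for "from" is needed when only the text is wanted.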

Otherwise, if this needs to be more robust, I would be tempted to use findall and do the checks in Python code.
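One way the findall-plus-checks idea could look is sketched below. The helper name from_b_text and the choice to raise ValueError are assumptions made for illustration, not something the answer prescribes; returning None or logging and skipping would work just as well.

```python
try:
    from lxml import etree as ET
except ImportError:  # assumption: fall back to the stdlib, same find/findall API
    import xml.etree.ElementTree as ET


def from_b_text(a):
    """Return the text of the single from/b under `a` (hypothetical helper).

    Raises ValueError when there is not exactly one match.
    """
    matches = a.findall('from/b')
    if len(matches) != 1:
        raise ValueError('expected exactly one from/b, found %d' % len(matches))
    return matches[0].text


good = ET.fromstring('<a><from><b>xxx</b></from><to><b>zzz</b></to></a>')
bad = ET.fromstring('<a><to><b>zzz</b></to></a>')  # no <from> at all

print(from_b_text(good))
try:
    from_b_text(bad)
except ValueError as exc:
    print('skipped:', exc)
```

The explicit length check costs one extra list per "a" element, so for documents that are already validated against an XSD the plain find('from/b') version is probably enough.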


 