
Python ElementTree: find element by its child's text using XPath

I'm trying to locate an element that has a certain text value in one of its children. For example:

<peers>
    <peer>
        <offset>1</offset>
        <tag>TRUE</tag>
    </peer>
    <peer>
        <offset>2</offset>
        <tag>FALSE</tag>
    </peer>
</peers>

From this XML document I would like to directly locate the tag element inside the peer element whose offset value is 1.

For that purpose I have the following XPath expression:

./peers/peer[offset='1']/tag

However, using this expression with ElementTree's Element.find() method fails and returns None rather than the "tag" element I'm interested in:

from xml.etree.ElementTree import fromstring

doc = fromstring("<peers><peer><offset>1</offset><tag>TRUE</tag></peer><peer><offset>2</offset><tag>FALSE</tag></peer></peers>")

tag = doc.find("./peers/peer[offset='1']/tag")

print(tag)


=> None

I'm inclined to believe that either my XPath expression above is wrong, or this is due to ElementTree supporting only a subset of XPath, as its documentation states. Looking for help. Thank you.

Using lxml.etree directly (the same should apply to ElementTree), you can achieve the result like this:

import lxml.etree

doc = lxml.etree.fromstring(...)
tag_elements = doc.xpath("/peers/peer/offset[text()='1']/../tag")

tag_elements will be the list of <tag> elements belonging to <peer> elements that contain an <offset> element whose text is 1.

Given this input (I've added another <peer> element to emphasize that tag_elements is a list):

<peers>
    <peer>
        <offset>1</offset>
        <tag>TRUE</tag>
    </peer>
    <peer>
        <offset>1</offset>
        <tag>OTHER</tag>
    </peer>
    <peer>
        <offset>2</offset>
        <tag>FALSE</tag>
    </peer>
</peers>

tag_elements will contain two elements:

for tag in tag_elements:
    print(tag.text)
-> TRUE
-> OTHER
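
The same list can also be built without any XPath predicate, which may be useful where the predicate syntax is not available. A minimal sketch using only the standard-library ElementTree; the helper name tags_for_offset is hypothetical and not part of either library:

```python
from xml.etree.ElementTree import fromstring

XML = (
    "<peers>"
    "<peer><offset>1</offset><tag>TRUE</tag></peer>"
    "<peer><offset>1</offset><tag>OTHER</tag></peer>"
    "<peer><offset>2</offset><tag>FALSE</tag></peer>"
    "</peers>"
)


def tags_for_offset(root, offset):
    """Return the <tag> child of every <peer> whose <offset> text equals offset.

    Hypothetical helper for illustration; offset is compared as a string.
    """
    matches = []
    for peer in root.findall("peer"):
        # findtext() returns None when the <peer> has no <offset> child.
        if peer.findtext("offset") == offset:
            tag = peer.find("tag")
            if tag is not None:
                matches.append(tag)
    return matches


doc = fromstring(XML)  # doc is the root <peers> element
for tag in tags_for_offset(doc, "1"):
    print(tag.text)
```

With the three-peer input above this prints TRUE and then OTHER, matching the lxml result.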

UPDATE:

doc.xpath("/peers/peer[offset=1]/tag") also works fine.

But doc.xpath("./peers/peer[offset=1]/tag") does not.
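
The relative form presumably fails because doc is already the root <peers> element, so a path beginning with ./peers looks for a <peers> child that does not exist. The same reasoning applies to the standard-library ElementTree from the question, which offers find() and findall() rather than an xpath() method. A minimal sketch, assuming Python 2.7 or later, where ElementTree supports the [child='text'] predicate:

```python
from xml.etree.ElementTree import fromstring

doc = fromstring(
    "<peers>"
    "<peer><offset>1</offset><tag>TRUE</tag></peer>"
    "<peer><offset>2</offset><tag>FALSE</tag></peer>"
    "</peers>"
)

# doc is the <peers> element itself, so "./peers" searches for a
# <peers> child of <peers> and finds nothing.
missing = doc.find("./peers/peer[offset='1']/tag")

# Starting the path at the root's own children matches as intended.
# The predicate value must be quoted in ElementTree's XPath subset.
tag = doc.find("./peer[offset='1']/tag")

print(missing)   # None
print(tag.text)  # TRUE
```

findall() with the same path returns every matching <tag> element instead of only the first one.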
