XQuery absolute path in python lxml

I have an XML document from which I want to extract the absolute path to a specific node (mynode) for later use. I retrieve the node like this:

from lxml import etree

xml = """
<a1>
    <b1>
        <c1>content1</c1>
    </b1>
    <b1>
        <c1>content2</c1>
    </b1>
</a1>"""
root = etree.fromstring(xml)

i = 0
mynode = root.xpath('//c1')[i]

To get the path I currently use

ancestors = mynode.xpath('./ancestor::*')
p = ''.join('/' + x.tag for x in ancestors) + '/' + mynode.tag

p now has the value

/a1/b1/c1

However, to store the path for later use I also have to store the index i from the first code snippet in order to retrieve the right node, because an XPath query for p will return both c1 nodes. I do not want to store that index.

What would be better is a path for XQuery that has the index included. For the first c1 node it could look like this:

/a1/b1[1]/c1

or this for the second c1 node:

/a1/b1[2]/c1
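
To make the difference concrete, here is a minimal sketch against the sample document from above. It only assumes lxml's standard xpath() method; the variable names ambiguous, first and second are made up for illustration.

```python
from lxml import etree

xml = """
<a1>
    <b1>
        <c1>content1</c1>
    </b1>
    <b1>
        <c1>content2</c1>
    </b1>
</a1>"""
root = etree.fromstring(xml)

# The unindexed path is ambiguous: it selects every c1 under any b1.
ambiguous = root.xpath('/a1/b1/c1')
print(len(ambiguous))

# A positional predicate on b1 narrows the result to a single node.
first = root.xpath('/a1/b1[1]/c1')
second = root.xpath('/a1/b1[2]/c1')
print(first[0].text, second[0].text)
```

With such a path the index i would no longer need to be stored separately.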

Does anyone have an idea how this can be achieved? Is there another method to specify a node and access it later on?

from lxml import etree

# ----------------------------------------------

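def node_location(node):
    # 1-based position among the preceding siblings that share this tag
    position = len(node.xpath('./preceding-sibling::' + node.tag)) + 1
    return '/' + node.tag + '[' + str(position) + ']'

def node_path(node):
    # ancestor-or-self yields the root first and the node itself last
    nodes = node.xpath('./ancestor-or-self::*')
    return ''.join(map(node_location, nodes))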

# ----------------------------------------------

xml = """
<a1>
    <b1>
        <c1>content1</c1>
    </b1>
    <b1>
        <c1>content2</c1>
    </b1>
</a1>"""

root = etree.fromstring(xml)

for mynode in root.xpath('//c1'):
    print(node_path(mynode))

prints

/a1[1]/b1[1]/c1[1]
/a1[1]/b1[2]/c1[1]
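
As an alternative to a hand-written helper, lxml's ElementTree object also offers a getpath() method that builds a structural path for an element. The following is a sketch rather than a definitive recipe; the list name paths is made up for illustration, and it is worth checking the output against your own lxml version and documents (namespaced tags in particular).

```python
from lxml import etree

xml = """
<a1>
    <b1>
        <c1>content1</c1>
    </b1>
    <b1>
        <c1>content2</c1>
    </b1>
</a1>"""
root = etree.fromstring(xml)
tree = root.getroottree()  # getpath() lives on the tree, not on the element

c1_nodes = root.xpath('//c1')
paths = [tree.getpath(node) for node in c1_nodes]
for path in paths:
    print(path)

# Sanity check: each generated path selects exactly the node it came from.
for path, node in zip(paths, c1_nodes):
    assert root.xpath(path) == [node]
```

For this document it should print /a1/b1[1]/c1 and /a1/b1[2]/c1, adding an index only where sibling elements share a tag.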

Is there another method to specify a node and access it later on?

If you mean "persist across separate invocations of the program", then no, not really.
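
What can be carried over between runs is the path string itself, as long as the document structure has not changed in the meantime. A rough sketch, assuming the same XML is parsed again later; stored_path, fresh_root and restored are illustrative names, and the second parse stands in for a later invocation of the program:

```python
from lxml import etree

xml = """
<a1>
    <b1>
        <c1>content1</c1>
    </b1>
    <b1>
        <c1>content2</c1>
    </b1>
</a1>"""

# First run: pick a node and keep only its path, a plain string
# that could be written to a file or database.
root = etree.fromstring(xml)
mynode = root.xpath('//c1')[1]
stored_path = root.getroottree().getpath(mynode)

# Later run: parse the document again and resolve the stored path.
fresh_root = etree.fromstring(xml)
matches = fresh_root.xpath(stored_path)

# Treat anything other than exactly one match as "node not found".
restored = matches[0] if len(matches) == 1 else None
print(stored_path, restored.text if restored is not None else None)
```

The element object itself does not survive; only a path that happens to select the equivalent node in the newly parsed tree does, so any edit to the document that shifts sibling positions invalidates it.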
