[英]How to get path of an element in lxml?
I'm searching in a HTML document using XPath from lxml in python.我正在使用 python 中 lxml 中的 XPath 在 HTML 文档中搜索。 How can I get the path to a certain element?
如何获取某个元素的路径? Here's the example from ruby nokogiri:
这是 ruby nokogiri 的示例:
page.xpath('//text()').each do |textnode|
path = textnode.path
puts path
end
print for example ' /html/body/div/div[1]/div[1]/p/text()[1] ' and this is the string I want to get in python.打印例如' /html/body/div/div[1]/div[1]/p/text()[1] ',这是我想在python中获得的字符串。
Use getpath
from ElementTree objects.使用 ElementTree 对象的
getpath
。
from lxml import etree
root = etree.fromstring('''
<foo><bar>Data</bar><bar><baz>data</baz>
<baz>data</baz></bar></foo>
''')
tree = etree.ElementTree(root)
for e in root.iter():
print(tree.getpath(e))
Prints印刷
/foo
/foo/bar[1]
/foo/bar[2]
/foo/bar[2]/baz[1]
/foo/bar[2]/baz[2]
See the Xpath and XSLT with lxml from the lxml documentation This gives the path of the element containg the text从 lxml 文档中查看带有 lxml的Xpath 和 XSLT这给出了包含文本的元素的路径
An example would be一个例子是
import cStringIO
from lxml import etree
f = cStringIO.StringIO('<foo><bar><x1>hello</x1><x1>world</x1></bar></foo>')
tree = lxml.etree.parse(f)
find_text = etree.XPath("//text()")
# and print out the required data
print [tree.getpath( text.getparent()) for text in find_text(tree)]
# answer I get is
>>> ['/foo/bar/x1[1]', '/foo/bar/x1[2]']
If all you have in your section of code is the element and you want the element's xpath do then element.getroottree().getpath(element)
will do the job.如果您的代码部分中只有元素,并且您希望元素的 xpath 执行,那么
element.getroottree().getpath(element)
将完成这项工作。
from lxml import etree
xml = '''
<test>
<a/>
<b>
<i/>
<ii/>
</b>
</test>
'''
tree = etree.fromstring(xml)
for element in tree.iter():
print element.getroottree().getpath(element)
root = etree.parse(open('tmp.txt'))
for e in root.iter():
print root.getpath(e)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.