
lxml - how to get xpath of HtmlElement?

Using an XML etree, it's possible to do:

etree.getpath(element)

How would I do the same thing, but with HTML instead of XML?

The _ElementTree has a getpath method, and you can reach the tree from any HtmlElement with getroottree():

In [17]: import lxml.html as LH
In [18]: content = '<root><div id="pgbrk" ......>....Page Break....</div></root>'

In [19]: root = LH.fromstring(content)

In [20]: tree = root.getroottree()

In [21]: tree.getpath(root[0])
Out[21]: '/html/body/root/div'
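The same idea can be wrapped in a small helper and checked by evaluating the returned path against the tree. This is only a sketch: the name `xpath_of` and the sample markup are made up for illustration, and it assumes the element is still attached to the tree it was parsed into.

```python
import lxml.html as LH

# Sample document (hypothetical markup, not from the question).
html = """
<html>
  <body>
    <div id="first">one</div>
    <div id="pgbrk">Page Break</div>
  </body>
</html>
"""


def xpath_of(element):
    """Return an absolute XPath for an lxml element.

    Hypothetical helper: it just combines getroottree() and getpath().
    """
    return element.getroottree().getpath(element)


root = LH.fromstring(html)
tree = root.getroottree()

# Locate an element some other way, then ask the tree for its path.
target = root.get_element_by_id("pgbrk")
path = xpath_of(target)
print(path)

# Round trip: evaluating the path should give back the same element.
found = tree.xpath(path)
print(found[0] is target)
```

Note that getpath() returns a purely positional path (tag names plus indexes where siblings share a tag), so the path may stop matching if the document structure changes.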
