lxml invalid predicate when using text()
I'm using lxml to do HTML screen scraping and I need to select an element by text(), in a similar way to what is done in another question with pure XML. However, no matter what I try, I get invalid predicate errors. I've simplified it down to this example:
import lxml.html
sample_html = "<div><h2>test string</h2><h2>other string</h2></div>"
sample_tree = lxml.html.fromstring(sample_html)
sample_tree.findall('.//h2[text()="test string"]')
While this should be valid, I continually get the error:
File "<string>", line unknown
SyntaxError: invalid predicate
Any hints on how to properly get lxml to select an element by text() when parsing HTML?
The expression itself is valid XPath, but findall() only understands the limited ElementPath subset, so you have to use the .xpath() method instead:
sample_tree.xpath('.//h2[text()="test string"]')
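To make the difference concrete, here is a minimal sketch that reuses the sample from the question and runs the same expression through both methods. The variable names findall_error and matched_text are just for illustration, and the exact message raised by findall() may vary between lxml versions, so it is captured rather than relied upon.

```python
import lxml.html

sample_html = "<div><h2>test string</h2><h2>other string</h2></div>"
sample_tree = lxml.html.fromstring(sample_html)

# findall() goes through ElementPath, which does not parse text() predicates.
try:
    sample_tree.findall('.//h2[text()="test string"]')
    findall_error = None
except SyntaxError as exc:
    findall_error = str(exc)

# xpath() hands the expression to the full XPath 1.0 engine instead.
matches = sample_tree.xpath('.//h2[text()="test string"]')
matched_text = [el.text for el in matches]

print("findall error:", findall_error)
print("xpath matches:", matched_text)
```

Note that xpath() always returns a list here, so take matches[0] if you only want the first element.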
Note that you may also use . in place of text() in this case:
.//h2[. = "test string"]
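The two forms are interchangeable here only because each h2 contains a single text node. As a rough illustration of where they diverge, the sketch below uses a made-up variant of the sample with a nested b element: text() compares against the element's direct text nodes, while . compares against its full string value, including descendant text.

```python
import lxml.html

# Hypothetical variant of the sample: part of the heading is wrapped in <b>.
nested_tree = lxml.html.fromstring("<div><h2>test <b>string</b></h2></div>")

# The only direct text node of <h2> is "test ", so this predicate fails.
by_text_node = nested_tree.xpath('.//h2[text()="test string"]')

# The string value of <h2> is "test string", so this predicate succeeds.
by_string_value = nested_tree.xpath('.//h2[. = "test string"]')

print(len(by_text_node), len(by_string_value))
```

So if the headings you are scraping may contain inline markup, the . form is probably the safer choice.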