
lxml invalid predicate when using text()

I'm using lxml to do HTML screen scraping and I need to select an element by text(), in a similar way to what is done in another question with pure XML. However, no matter what I try, I get invalid predicate errors. I've simplified it down to this example:

import lxml.html
sample_html = "<div><h2>test string</h2><h2>other string</h2></div>"
sample_tree = lxml.html.fromstring(sample_html)
sample_tree.findall('.//h2[text()="test string"]')

While this should be valid, I continually get the error:

  File "<string>", line unknown
SyntaxError: invalid predicate

Any hints on how to properly get lxml to select an element by text() when parsing HTML?

The expression itself is valid, but you have to use the .xpath() method instead:

sample_tree.xpath('.//h2[text()="test string"]')
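The reason is that findall() goes through ElementPath, a restricted subset of XPath that does not understand text() inside a predicate, whereas .xpath() evaluates full XPath 1.0. A minimal sketch of the difference, assuming lxml is installed and reusing the sample markup from the question (the names findall_error and matched_texts are illustrative only):

```python
import lxml.html

sample_html = "<div><h2>test string</h2><h2>other string</h2></div>"
sample_tree = lxml.html.fromstring(sample_html)

# findall() uses the limited ElementPath syntax, so the text() predicate
# is expected to be rejected with a SyntaxError.
try:
    sample_tree.findall('.//h2[text()="test string"]')
    findall_error = None
except SyntaxError as exc:
    findall_error = str(exc)

# xpath() evaluates the same expression as full XPath 1.0.
matches = sample_tree.xpath('.//h2[text()="test string"]')
matched_texts = [element.text for element in matches]

print("findall error:", findall_error)
print("xpath matches:", matched_texts)
```

Note that .xpath() always returns a list of matches, so take the first item if a single element is wanted.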

Note that you may also use . in place of text() in this case:

.//h2[. = "test string"]
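The two forms are interchangeable here only because each h2 holds a single text node. In XPath 1.0, text() compares against the element's direct text node children, while . compares against the element's full string value, including text inside descendants. A small sketch of where they diverge, assuming lxml is installed; the nested markup and variable names below are made up for illustration and are not from the question:

```python
import lxml.html

# Hypothetical markup: the first h2 splits its text across a child <b>.
nested_html = "<div><h2>test <b>string</b></h2><h2>test string</h2></div>"
nested_tree = lxml.html.fromstring(nested_html)

# text() only looks at direct text node children, so the first h2
# (whose own text node is "test ") does not match.
by_text_node = nested_tree.xpath('.//h2[text()="test string"]')

# "." uses the concatenated text of the element and its descendants,
# so both h2 elements match.
by_string_value = nested_tree.xpath('.//h2[. = "test string"]')

print(len(by_text_node), len(by_string_value))
```

Which form is appropriate depends on whether the headings being scraped may contain inline markup.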
