Python Selenium: how to get text from a div after a span

Question

I want to select the text inside a div that comes after a span.

The source looks like this:

<div id="citation">
    <cite>Journal</cite>
    ", "
    <span class="year">2014</span>
    ", "
    <span class="volume">100</span>
    " (4), pp 100-200"
</div>

I only want the " (4), pp 100-200".

I know how to get the text of the entire div, or of each span, but how do I grab only the last piece of text? This XPath will not work:

ISSUE_XPATH = "//*[@id=\"citation\"]/text()[3]"

It fails with this error message:

selenium.common.exceptions.InvalidSelectorException: Message: {"errorMessage":"The result of the xpath expression \"//*[@id=\"citation\"]/text()[3]\" is: [object Text]. It should be an element."

Answer 1

Unfortunately, //*[@id=\"citation\"]/text()[3] is not going to work in Selenium: its locators can only target actual elements, not text nodes.
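The text node does exist in the browser's DOM; it is only the element locators that cannot return it. As a hedged sketch of one way around that, JavaScript run through the driver's execute_script can read the node directly. The helper name text_after_volume and the constant JS_TEXT_AFTER_VOLUME are made up for illustration, and the sketch assumes the div contains a span with class "volume" as in the question.

```python
# JavaScript executed in the browser: look up span.volume inside the element
# passed as arguments[0] and return the text of the node right after it.
JS_TEXT_AFTER_VOLUME = """
var span = arguments[0].querySelector('span.volume');
var node = span ? span.nextSibling : null;
return node ? node.textContent : null;
"""


def text_after_volume(driver, container):
    """Return the stripped text that follows span.volume, or None if absent.

    `driver` is a Selenium WebDriver and `container` is the WebElement for
    the citation div (hypothetical helper, not part of Selenium itself).
    """
    text = driver.execute_script(JS_TEXT_AFTER_VOLUME, container)
    return text.strip() if text is not None else None


# Usage, assuming a live `driver` with the page already loaded:
#   from selenium.webdriver.common.by import By
#   citation = driver.find_element(By.ID, "citation")
#   print(text_after_volume(driver, citation))
```

This keeps everything inside Selenium at the cost of embedding a small piece of JavaScript in the Python code.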

What I would do in this case is additionally use the BeautifulSoup HTML parser, which can locate the text sibling that follows the span element with class="volume":

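from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By

# `driver` is an already-initialized Selenium WebDriver with the page loaded.
# Selenium 4 removed find_element_by_id; find_element(By.ID, ...) replaces it.
citation = driver.find_element(By.ID, "citation")
html = citation.get_attribute("outerHTML")

# Parse the div's markup and take the text node that follows span.volume.
soup = BeautifulSoup(html, "html.parser")
desired_text = soup.find("span", class_="volume").next_sibling
print(desired_text)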
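To try the "text sibling after the span" idea without a browser or BeautifulSoup, here is a minimal standard-library sketch that uses xml.etree.ElementTree, where the text following an element is exposed as its tail attribute. This is a swapped-in technique, not the approach above, and it only works when the markup is well-formed XML, which real outerHTML often is not. The SNIPPET string and the helper text_after_span are illustrative; the snippet assumes the quoted strings in the question's source are how the browser's inspector displays plain text nodes.

```python
import xml.etree.ElementTree as ET

# Static copy of the citation div, assuming the quoted strings shown in the
# question are inspector renderings of plain text nodes.
SNIPPET = """<div id="citation">
    <cite>Journal</cite>, <span class="year">2014</span>, <span class="volume">100</span> (4), pp 100-200
</div>"""


def text_after_span(markup, css_class):
    """Return the stripped text that follows the first span with this class."""
    root = ET.fromstring(markup)
    span = root.find(f".//span[@class='{css_class}']")
    if span is None or span.tail is None:
        return None
    # `tail` holds the text between this element's end tag and the next tag.
    return span.tail.strip()


print(text_after_span(SNIPPET, "volume"))  # prints: (4), pp 100-200
```

For markup taken from a live page, the BeautifulSoup version is the more forgiving choice, since html.parser tolerates HTML that is not valid XML.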

Answer 1
2, accepted, 2016-04-06 14:54:39
