Extracting full text from HTML span element with XPath expression
I have an HTML tree which looks like this:
<div id="RF4FOEQ3OPBEX" data-hook="review" class="a-section review aok-relative"><div
<div data-hook="review-collapsed" aria-expanded="false" class="a-expander-content reviewText review-text-content a-expander-partial-collapse-content">
<span>
Text line1.
<br>
Text line2.
</span>
I am trying to extract all the text from the span with the following XPath expression:
//div[@data-hook="review"]//div[@data-hook="review-collapsed"]/span/text()
However, this approach only returns the first line of text, up to the line break. How should I approach this in order to extract the full text content of the HTML span tag?
I would appreciate any help, and thank you in advance for the support.
Use // and the getall method to get all the text inside a specific element. getall returns a list, so just join it:
txt = "".join(response.xpath('//div[@data-hook="review"]//div[@data-hook="review-collapsed"]/span//text()').getall())
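To see the difference between "first text node only" and "all text nodes" without a live Scrapy response, here is a minimal sketch using only the standard library. It is not the Scrapy API: it swaps in xml.etree.ElementTree, which supports only a subset of XPath, and it assumes a simplified, well-formed copy of the markup (the divs are closed and the break is written as <br/>). SAMPLE, first_only and full_text are names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified copy of the review markup, made well-formed for the XML parser.
SAMPLE = """
<div data-hook="review">
  <div data-hook="review-collapsed">
    <span>
      Text line1.
      <br/>
      Text line2.
    </span>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE.strip())

# ElementTree cannot select text nodes directly, so select the span element itself.
span = root.find(".//div[@data-hook='review-collapsed']/span")

# span.text holds only the text that comes before the first child element (<br/>).
first_only = span.text.strip()

# itertext() walks every text node under the span, similar in spirit to //text() plus getall().
parts = [t.strip() for t in span.itertext()]
full_text = " ".join(p for p in parts if p)

print(first_only)
print(full_text)
```

Stripping each piece and joining with a space is a choice made here to drop the indentation and newlines around each line; joining with "" as in the answer keeps that whitespace as it appears in the page.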