
How to extract text from an element that does not have an XPath

I am trying to scrape the dollar-sign rating for each restaurant on a food delivery website, but there is no XPath available for it.

<!-- react-text: 2108 -->
"$$"
<!-- /react-text -->

The above code is what the site uses for the dollar rating, as shown when I inspected the page. I've tried targeting the line directly above it:

    <i class="icon-bullet--small">·</i>

However, this outputs the dot character, since that element is not the dollar rating. I've also tried using:

    cost = ['//li[{}]/a/div[2]/p[2]/!'.format(x) for x in range(1, 999)]

as well as using "!--", "react", and "react-text" in the XPath, but none of it works. Any suggestions on how to approach this?

This XPath,

//comment()[normalize-space() = "react-text: 2108"]/following-sibling::text()

will select the text node that follows the targeted comment as a sibling (add [1] after text() if more than one text node follows it), returning

"$$"

as requested.


Important note: @DebanjanB has helpfully pointed out that the comment containing react-text: 2108 is a React directive that Selenium won't see unless the content is extracted as page_source. Thanks, Debanjan!
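Since the note above suggests working from the raw page source, here is a minimal sketch of that idea. It is not the XPath approach: it swaps in Python's standard-library html.parser to collect whatever text sits between a react-text comment and its closing /react-text comment. The class name ReactTextExtractor, the helper extract_react_texts, and the sample fragment are all hypothetical, made up for illustration; in practice the input string would presumably be something like Selenium's driver.page_source.

```python
from html.parser import HTMLParser


class ReactTextExtractor(HTMLParser):
    """Collect the text found between react-text comment markers."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True between "react-text: N" and "/react-text"
        self.texts = []      # one entry per react-text block

    def handle_comment(self, data):
        marker = data.strip()
        if marker.startswith("react-text:"):
            # Opening marker: start a new, empty entry.
            self.inside = True
            self.texts.append("")
        elif marker == "/react-text":
            # Closing marker: stop collecting.
            self.inside = False

    def handle_data(self, data):
        # Only keep text that appears inside a react-text block.
        if self.inside:
            self.texts[-1] += data


def extract_react_texts(page_source):
    """Return the stripped text of every react-text block, in document order."""
    parser = ReactTextExtractor()
    parser.feed(page_source)
    parser.close()
    return [text.strip() for text in parser.texts]


# Hypothetical fragment modelled on the markup in the question (assumption:
# the real page nests this inside more elements, which does not matter here).
sample = (
    '<p><i class="icon-bullet--small">·</i>'
    '<!-- react-text: 2108 -->$$<!-- /react-text --></p>'
)
print(extract_react_texts(sample))
```

This returns every react-text block on the page, so picking out the dollar ratings would still need some filtering (for example, keeping only strings made of "$" characters). If an XPath-capable HTML parser is available, evaluating the XPath from the answer against the same page-source string is likely the more direct route.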
