Locating the right tag in HTML while web scraping in Python
I'm working on a school project where I display the current price of Bitcoin, ETH, and maybe one other coin. I'm scraping https://cryptowat.ch/ but I can't find the tag used to store the live price. When I parse the div tag it returns the price, but I'm not able to isolate it so that I can display it in Python:
<div class="rankings-col__header__segment"><h2>BTC</h2><weak>usd </weak>10857.00</div>
From what I understand, you know the BTC string and can use it as the basis for your locator.
If you can use XPath, match the h2 by its text and then take the text node that follows it with following-sibling::text():
//h2[. = 'BTC']/following-sibling::text()
Example using lxml.html:
from lxml.html import fromstring
data = """<div class="rankings-col__header__segment"><h2>BTC</h2><weak>usd </weak>10857.00</div>"""
root = fromstring(data)
print(root.xpath("//h2[. = 'BTC']/following-sibling::text()"))
Prints ['10857.00'].
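Since the goal is to display the price, the XPath result still has to be turned from a list of strings into a number. Here is a minimal sketch of that step, assuming the markup looks like the snippet in the question. The helper name price_for is hypothetical, and passing the symbol as an XPath variable is just one way to avoid building the expression by hand:

```python
from lxml.html import fromstring


def price_for(html, symbol):
    """Return the price that follows <h2>symbol</h2> as a float, or None."""
    root = fromstring(html)
    # $symbol is an XPath variable; lxml fills it from the keyword argument.
    texts = root.xpath(
        "//h2[. = $symbol]/following-sibling::text()", symbol=symbol
    )
    for text in texts:
        # Assumption: the price may carry whitespace or thousands separators.
        cleaned = text.strip().replace(",", "")
        if not cleaned:
            continue
        try:
            return float(cleaned)
        except ValueError:
            continue  # not a number, try the next text node
    return None


data = """<div class="rankings-col__header__segment"><h2>BTC</h2><weak>usd </weak>10857.00</div>"""
price = price_for(data, "BTC")
print(f"BTC: {price:.2f} USD")
```

The function returns None when the symbol is not found, so check for that before formatting the value.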
If, by any chance, you use BeautifulSoup, it would be:
from bs4 import BeautifulSoup
data = """<div class="rankings-col__header__segment"><h2>BTC</h2><weak>usd </weak>10857.00</div>"""
soup = BeautifulSoup(data, "html.parser")
# string=True matches only text nodes, so the <weak> tag is skipped
print(soup.find("h2", string="BTC").find_next_sibling(string=True))
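The same lookup can be repeated for each coin you want to show. The sketch below assumes every coin sits in its own segment with the same structure as the BTC one. The ETH segment and its value are made up for illustration, and the helper name prices is hypothetical:

```python
from bs4 import BeautifulSoup

# The BTC segment comes from the question; the ETH segment is a made-up
# sample that only mirrors the same structure.
data = """
<div class="rankings-col__header__segment"><h2>BTC</h2><weak>usd </weak>10857.00</div>
<div class="rankings-col__header__segment"><h2>ETH</h2><weak>usd </weak>350.00</div>
"""


def prices(html, symbols):
    """Map each symbol found in the markup to its price as a float."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for symbol in symbols:
        heading = soup.find("h2", string=symbol)
        if heading is None:
            continue  # symbol not present in the markup
        # string=True matches only text nodes, so the <weak> tag is skipped.
        text = heading.find_next_sibling(string=True)
        if text is None:
            continue
        result[symbol] = float(text.strip())
    return result


for symbol, price in prices(data, ["BTC", "ETH", "XRP"]).items():
    print(f"{symbol}: {price:.2f} USD")
```

Symbols that are missing from the markup (XRP here) are simply left out of the result.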