How to filter values extracted from HTML table
I want to scrape stats and odds from betexplorer, in particular this page: https://www.betexplorer.com/soccer/russia/premier-league-2019-2020/cska-moscow-fc-tambov/8Ya3mpOC/
homeodd = driver.find_element_by_xpath("//*[contains(@id,'aodds')]").text
This returns:
CLOSING ODDS
22.07. 17:48 1.28 +0.01
OPENING ODDS
17.07. 00:07 1.27
How could I scrape only the opening odds?
Try a more specific XPath to get the opening odds only:
xpath = '//table[starts-with(@id,"aodds")]//tr[th="Opening odds"]/following-sibling::tr'
homeodd = driver.find_element_by_xpath(xpath).text
If you want to ignore the date and change values:
xpath = '//table[starts-with(@id,"aodds")]//tr[th="Opening odds"]/following-sibling::tr/td[@class="bold"]'
An XPath such as:
//tbody[contains(@id,'aodds')]/tr[4]
This seems to highlight the relevant row in the DOM.
From there, you can use find_elements and print out each child td. (Shout if you need support with this.)
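A minimal sketch of that find_elements step, assuming the row XPath above still matches the page and that driver is an already-created Selenium WebDriver. The helper name row_cell_texts is made up for illustration. It uses the find_elements(by, value) call because the find_element_by_* helpers were removed in Selenium 4.3; the locator string "xpath" is the value of By.XPATH.

```python
# Row XPath taken from the answer above; it may need adjusting if the page layout changes.
OPENING_ROW_XPATH = "//tbody[contains(@id,'aodds')]/tr[4]"


def row_cell_texts(driver, row_xpath=OPENING_ROW_XPATH):
    """Return the text of every td under the row(s) matched by row_xpath.

    `driver` is assumed to be a Selenium WebDriver (hypothetical helper,
    not part of Selenium itself).
    """
    # "xpath" is the same string as selenium.webdriver.common.by.By.XPATH
    cells = driver.find_elements("xpath", row_xpath + "/td")
    return [cell.text for cell in cells]


# Usage, once a driver has loaded the match page:
# for text in row_cell_texts(driver):
#     print(text)
```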
As an alternative, if you already have the whole text, you can just manipulate the string. Assuming homeodd contains the text you posted:
print(homeodd.split('OPENING ODDS')[-1])
That splits the text around the OPENING ODDS marker, and [-1] takes the last item in the resulting list, i.e. everything after the marker.
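To go one step further and pull out just the number, something like the following should work. It is only a sketch: the sample string is copied from the question, and it assumes the opening line always has the shape "date time odd". The variable names are illustrative.

```python
# Sample text as returned in the question
homeodd = """CLOSING ODDS
22.07. 17:48 1.28 +0.01
OPENING ODDS
17.07. 00:07 1.27"""

# Everything after the marker, with surrounding whitespace removed
opening_line = homeodd.split("OPENING ODDS")[-1].strip()

# Assumed layout: date, time, odd (whitespace-separated)
date, time, odd = opening_line.split()[:3]
opening_odd = float(odd)

print(opening_odd)
```

If the layout of the text ever changes (for example an extra line after the opening odds), the split on whitespace would need adjusting.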