How to filter values extracted from an HTML table

Question

I want to scrape stats and odds from betexplorer, in particular from this page: https://www.betexplorer.com/soccer/russia/premier-league-2019-2020/cska-moscow-fc-tambov/8Ya3mpOC/

homeodd = driver.find_element_by_xpath("//*[contains(@id,'aodds')]").text

This returns:

CLOSING ODDS
22.07. 17:48 1.28 +0.01
OPENING ODDS
17.07. 00:07 1.27

How can I scrape only the opening odds?

Answer 1

Try a more specific XPath to get only the opening odds:

xpath = '//table[starts-with(@id,"aodds")]//tr[th="Opening odds"]/following-sibling::tr'
homeodd = driver.find_element_by_xpath(xpath).text

If you want to ignore the date and the change value:

xpath = '//table[starts-with(@id,"aodds")]//tr[th="Opening odds"]/following-sibling::tr/td[@class="bold"]'
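As a rough sketch of how the two XPaths above could be wrapped for newer Selenium releases, where the find_element_by_* helpers are no longer available and find_element(by, value) is used instead: the function name opening_odds_text and the value_only flag are made up for illustration, and the XPaths are taken unchanged from this answer, so they only work while the page keeps that table layout.

```python
def opening_odds_text(driver, value_only=False):
    """Return the text of the row (or only the bold odds cell) under 'Opening odds'.

    `driver` is assumed to be a Selenium WebDriver that has already loaded the page.
    """
    # XPath from the answer: the row that follows the "Opening odds" header row.
    xpath = ('//table[starts-with(@id,"aodds")]'
             '//tr[th="Opening odds"]/following-sibling::tr')
    if value_only:
        # Narrow down to the bold cell, skipping the date and the change value.
        xpath += '/td[@class="bold"]'
    # "xpath" is the string value of Selenium's By.XPATH constant.
    return driver.find_element("xpath", xpath).text
```

With a live driver this would be called as opening_odds_text(driver, value_only=True); importing By from selenium.webdriver.common.by and passing By.XPATH is equivalent to the plain "xpath" string used here.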

Answer 2

An XPath such as:

//tbody[contains(@id,'aodds')]/tr[4]

seems to highlight the relevant row in the DOM.

From there, you can use find_elements and print out each child td (shout if you need support with this).
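A minimal sketch of that find_elements step, assuming the row XPath above still matches the page and that driver is a Selenium WebDriver with the page loaded; the helper name row_cell_texts is hypothetical.

```python
def row_cell_texts(driver, row_xpath="//tbody[contains(@id,'aodds')]/tr[4]"):
    """Return the text of every td directly under the row matched by row_xpath."""
    # Locate the row first; "xpath" is the string value of Selenium's By.XPATH.
    row = driver.find_element("xpath", row_xpath)
    # "./td" is relative to the row element, so only its own cells are returned.
    cells = row.find_elements("xpath", "./td")
    return [cell.text for cell in cells]
```

Looping over the returned list and printing each item gives one cell per line, for example the date, the odds and any change value, depending on what the row actually contains.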

As an alternative, if you already have the whole text, you can just manipulate the string. Assuming homeodd contains the text you posted:

print (homeodd.split('OPENING ODDS')[-1])

That splits the string at 'OPENING ODDS', and [-1] takes the last item of the resulting list, i.e. everything after the opening-odds heading.
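To go one step further and pull out just the number, the remainder can be split on whitespace. This is only a sketch: it assumes the text has exactly the shape shown in the question (date, time, then odds after the heading), and parse_opening_odds is a made-up helper name.

```python
def parse_opening_odds(block_text, marker="OPENING ODDS"):
    """Return (timestamp, odds) parsed from the text after the marker."""
    if marker not in block_text:
        raise ValueError("marker %r not found in text" % marker)
    # Everything after the last occurrence of the marker.
    tail = block_text.split(marker)[-1].strip()
    # Assumed shape, as in the question: '17.07. 00:07 1.27'
    parts = tail.split()
    timestamp = " ".join(parts[:2])
    odds = float(parts[2])
    return timestamp, odds


sample = """CLOSING ODDS
22.07. 17:48 1.28 +0.01
OPENING ODDS
17.07. 00:07 1.27"""

timestamp, odds = parse_opening_odds(sample)
print(timestamp, odds)  # prints: 17.07. 00:07 1.27
```

The string approach is quick but depends on the exact text layout; the XPath approach is tied to the table structure instead, so either one may need adjusting if the page changes.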

Solution 1
2 2020-08-05 12:44:15

Solution 2
0 Accepted 2020-08-05 12:42:58
