Scraping multiple tables with BeautifulSoup
How can I get the number of goals from the 'Goal times' table at this URL: https://www.soccerstats.com/pmatch.asp?league=argentina3&stats=114-3-8-2022-almagro-d.-de-belgrano ?
PS: The main page is https://www.soccerstats.com/matches.asp?matchday=1
I am able to find the table, but when I try to get the stats, nothing changes.
Code:
import requests
from bs4 import BeautifulSoup

url = 'https://www.soccerstats.com/pmatch.asp?league=argentina3&stats=114-3-8-2022-almagro-d.-de-belgrano'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36'}

session = requests.Session()
session.headers.update(headers)
response = session.get(url)
response.raise_for_status()  # stop here on a non-200 response instead of leaving soup undefined
soup = BeautifulSoup(response.text, 'html.parser')

# Look for the table whose nearest preceding <h2> sibling reads 'Goal times'.
for ta in soup.find_all('table'):
    for sib in ta.find_previous_siblings():
        if sib.name == 'h2':
            if sib.text == 'Goal times':
                goal_scoring_stats_table = ta
            else:
                break

# Inside that table, print each nested table that has a <b>Home</b> sibling before it.
for ta in goal_scoring_stats_table.find_all('table'):
    for sib in ta.find_previous_siblings():
        if sib.name == 'b':
            if sib.text == 'Home':
                print(ta)
You could use pandas to get all the tables and then fish out the one you're after. Finally, massage the table to your liking.
For example:
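from io import StringIO

import pandas as pd
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:98.0) Gecko/20100101 Firefox/98.0",
}
url = "https://www.soccerstats.com/pmatch.asp?league=argentina3&stats=114-3-8-2022-almagro-d.-de-belgrano"
html = requests.get(url, headers=headers).text
# Wrap the HTML in StringIO: recent pandas versions deprecate passing a literal string.
# The table of interest was at index 106 when this answer was written.
df = pd.read_html(StringIO(html), flavor="lxml")[106]
print(df)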
Output:
           0    1    2    3     4
0       0-15   GF  0.0  NaN   NaN
1       0-15   GA  0.0  NaN   NaN
2      16-30   GF  3.0  NaN   NaN
3      16-30   GA  1.0  NaN   NaN
4      31-45   GF  1.0  NaN   NaN
5      31-45   GA  0.0  NaN   NaN
6        NaN  NaN  NaN  NaN   NaN
7      46-60   GF  0.0  NaN   NaN
8      46-60   GA  0.0  NaN   NaN
9      61-75   GF  0.0  NaN   NaN
10     61-75   GA  0.0  NaN   NaN
11     76-90   GF  0.0  NaN   NaN
12     76-90   GA  1.0  NaN   NaN
13       NaN  NaN  NaN  NaN   NaN
14  1st half   GF  4.0  NaN  100%
15  1st half   GA  1.0  NaN   50%
16       NaN  NaN  NaN  NaN   NaN
17  2nd half   GF  0.0  NaN    0%
18  2nd half   GA  1.0  NaN   50%
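As for massaging the table, here is a minimal sketch of one possible cleanup. It assumes you want one row per period with GF and GA as columns. The helper name tidy_goal_times and the column names period, kind and goals are made up for illustration. The sample frame is a subset of the rows printed above, so the snippet runs without hitting the site.

```python
import pandas as pd

# Sample rows copied from the printed output (a subset of the raw table).
raw = pd.DataFrame(
    [
        ["0-15", "GF", 0.0, None, None],
        ["0-15", "GA", 0.0, None, None],
        ["16-30", "GF", 3.0, None, None],
        ["16-30", "GA", 1.0, None, None],
        [None, None, None, None, None],
        ["1st half", "GF", 4.0, None, "100%"],
        ["1st half", "GA", 1.0, None, "50%"],
    ]
)


def tidy_goal_times(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per period, GF and GA as columns."""
    # Keep the period, the GF/GA label and the goal count; drop the empty spacer rows.
    goals = df[[0, 1, 2]].dropna(how="all")
    goals.columns = ["period", "kind", "goals"]
    # Pivot so that each period gets a GF column and a GA column.
    wide = goals.pivot(index="period", columns="kind", values="goals")
    # pivot sorts the index, so restore the order in which periods appeared.
    order = goals["period"].drop_duplicates().tolist()
    return wide.loc[order, ["GF", "GA"]].astype(int)


tidy = tidy_goal_times(raw)
print(tidy)
```

With the real page you would pass df from the snippet above instead of raw. Keep in mind that the positional index 106 depends on the page layout, so it is worth checking that the selected table still contains the expected period labels before tidying it.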