
Missing tables with BeautifulSoup web scraping

I have been trying to scrape team data from the evolving-hockey.com website, and can only read ...

Using:

from bs4 import BeautifulSoup as bs
from bs4 import Comment
import requests
    
site = 'https://evolving-hockey.com/stats/team_standard/?_inputs_&std_tm_str=%225v5%22&std_tm_table=%22On-Ice%22&std_tm_team=%22All%22&std_tm_range=%22Seasons%22&std_tm_adj=%22Score%20%26%20Venue%22&std_tm_span=%22Regular%22&dir_ttbl=%22Stats%22&std_tm_type=%22Rates%22&std_tm_group=%22Season%22'

r = requests.get(site)
soup = bs(r.content, 'html.parser')
data = soup.find_all('table')

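Even though the page's HTML shows that it contains tables, this returns nothing.

Why can't BeautifulSoup find the table data? Are the tables linked from somewhere else?

Thanks for your help.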
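
One way to see why find_all('table') comes back empty is to count the tables in the HTML that requests receives and compare that with the HTML a browser holds after its scripts have run. The sketch below uses two made-up HTML strings as stand-ins for those two states. The count_tables helper, the "table-output" id and the "app.js" name are illustrative assumptions, not taken from the site.

```python
from bs4 import BeautifulSoup


def count_tables(html):
    """Return how many <table> elements the given HTML contains."""
    return len(BeautifulSoup(html, "html.parser").find_all("table"))


# Hypothetical stand-in for what requests.get() might return:
# an empty placeholder that a script is expected to fill in later.
static_html = (
    '<html><body><div id="table-output"></div>'
    '<script src="app.js"></script></body></html>'
)

# Hypothetical stand-in for the same page after the script has run in a browser.
rendered_html = (
    '<html><body><div id="table-output">'
    "<table><tr><td>Ducks</td></tr></table>"
    "</div></body></html>"
)

print(count_tables(static_html))    # 0
print(count_tables(rendered_html))  # 1
```

With the code from the question, count_tables(r.content) would be the same check on the real response.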

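The table data is loaded dynamically, and requests does not run the page's JavaScript, so the HTML it receives does not contain the tables. To retrieve the dynamically loaded data, I used Selenium: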

import time

from bs4 import BeautifulSoup as bs
from selenium import webdriver

driver = webdriver.Chrome()
site = 'https://evolving-hockey.com/stats/team_standard/?_inputs_&std_tm_str=%225v5%22&std_tm_table=%22On-Ice%22&std_tm_team=%22All%22&std_tm_range=%22Seasons%22&std_tm_adj=%22Score%20%26%20Venue%22&std_tm_span=%22Regular%22&dir_ttbl=%22Stats%22&std_tm_type=%22Rates%22&std_tm_group=%22Season%22'

driver.get(site)
time.sleep(5)  # crude delay so the dynamically loaded tables have time to render
soup = bs(driver.page_source, 'html.parser')

# The second <tr> holds the column headers; print them on one line
data = soup.find_all('tr')[1]
for d in data:
    print(d.get_text(strip=True), end='   ')

# The header row again, followed by the team rows, one row per line
data2 = soup.find_all('tr')[1:33]
for x in data2:
    print(x.get_text(strip=True, separator='  '), end='\n')

driver.quit()

Prints:

   Name   Team   Season   GP   TOI   GF%   SF%   FF%   CF%   xGF%   GF/60   GA/60   SF/60   SA/60   FF/60   FA/60   CF/60   CA/60   xGF/60   xGA/60   G±/60   S±/60   F±/60   C±/60   xG±/60   Sh%   Sv%   Name  Team  Season  GP  TOI  GF%  SF%  FF%  CF%  xGF%  GF/60  GA/60  SF/60  SA/60  FF/60  FA/60  CF/60  CA/60  xGF/60  xGA/60  G±/60  S±/60  F±/60  C±/60  xG±/60  Sh%  Sv%
1  Ducks  ANA  19-20  71  3450.65  46.57  47.7  47.97  47.87  47.08  2.22  2.55  28.77  31.54  41.45  44.96  53.96  58.75  2.32  2.61  -0.33  -2.77  -3.51  -4.79  -0.29  7.73  91.91
2  Coyotes  ARI  19-20  70  3405.93  50.08  49.72  48.57  48.6  49.61  2.24  2.24  31.12  31.47  42.56  45.07  56.02  59.24  2.33  2.36  0.01  -0.35  -2.51  -3.22  -0.04  7.21  92.9
3  Bruins  BOS  19-20  70  3328.28  57.69  52.48  51.83  51.93  52.82  2.56  1.88  31.07  28.13  42.5  39.5  55.98  51.81  2.22  1.98  0.68  2.94  3  4.17  0.24  8.24  93.32
4  Sabres  BUF  19-20  69  3393.5  49.11  47.9  48.37  48.81  47.54  2.29  2.37  27.93  30.38  38.75  41.36  50.16  52.62  2.05  2.26  -0.08  -2.45  -2.61  -2.45  -0.21  8.2  92.19
5  Hurricanes  CAR  19-20  68  3217.15  50.97  52.96  53.66  54.42  52.37  2.63  2.53  32.38  28.75  45.33  39.15  60.05  50.29  2.76  2.51  0.1  3.62  6.18  9.76  0.25  8.12  91.2
6  Blue Jackets  CBJ  19-20  70  3478.85  50.55  51.82  50.66  48.87  51.66  2.14  2.09  31.53  29.32  41.53  40.44  53.63  56.11  2.22  2.08  0.05  2.21  1.09  -2.48  0.14  6.78  92.87
7  Flames  CGY  19-20  70  3429.05  47.34  48.91  49.42  49.91  50.74  2.31  2.57  30.24  31.58  43.03  44.05  57.22  57.41  2.47  2.4  -0.26  -1.35  -1.02  -0.2  0.07  7.64  91.86

etc.
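
The output above is flat text. If the values are needed by column name, the rows can be collected into dictionaries instead of printed. This is only a sketch: the rows_to_records helper and its header_index parameter are hypothetical, and sample_html is a shortened, made-up stand-in for driver.page_source (the real page's markup may differ, and its header row sits at index 1 rather than 0).

```python
from bs4 import BeautifulSoup


def rows_to_records(html, header_index=0):
    """Turn <tr> rows into a list of dicts keyed by the header cells."""
    rows = BeautifulSoup(html, "html.parser").find_all("tr")
    header = [
        cell.get_text(strip=True)
        for cell in rows[header_index].find_all(["th", "td"])
    ]
    records = []
    for row in rows[header_index + 1:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        # Skip rows whose cell count does not match the header
        if len(cells) == len(header):
            records.append(dict(zip(header, cells)))
    return records


# Hypothetical, shortened stand-in for driver.page_source
sample_html = """
<table>
  <tr><th>Name</th><th>Team</th><th>Season</th><th>GP</th></tr>
  <tr><td>Ducks</td><td>ANA</td><td>19-20</td><td>71</td></tr>
  <tr><td>Coyotes</td><td>ARI</td><td>19-20</td><td>70</td></tr>
</table>
"""

records = rows_to_records(sample_html)
print(records[0]["Name"], records[0]["GP"])  # Ducks 71
```

All values stay as strings here; converting columns such as GP to numbers would be a separate step.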
