Why can't Beautifulsoup find a table on this webpage
I am trying to scrape a table from this website using Google Colab, but when I run the code below I get empty brackets.
import urllib.request as url
from bs4 import BeautifulSoup
page = 'https://www.stadiumgaming.gg/rank-checker?pokemon=Walrein'
html = url.urlopen(page)
soup = BeautifulSoup(html, 'html5lib').findAll('td')
print(soup)
Output: []
How can I find the table on this page so that I can parse it into a dataframe?
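The empty list means the <td> elements are not present in the HTML that urlopen downloaded. BeautifulSoup itself handles static markup fine; a minimal sketch (using a made-up HTML snippet, not this site's markup) showing that findAll('td') does return cells whenever they exist in the source:

```python
from bs4 import BeautifulSoup

# A static table: the <td> cells are present in the raw markup itself.
static_html = "<table><tr><td>Walrein</td><td>1500</td></tr></table>"

cells = BeautifulSoup(static_html, "html.parser").find_all("td")
print([c.text for c in cells])  # → ['Walrein', '1500']
```

So an empty result on the live page indicates the table is inserted into the DOM after page load, rather than a parsing problem.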
Beautifulsoup can't find a table on this webpage because it is populated dynamically by JavaScript, and bs4 can't execute JS. But you can mimic a real browser with selenium and then hand the rendered page to bs4 and pandas:
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service)
url = 'https://www.stadiumgaming.gg/rank-checker?pokemon=WALREIN'
driver.get(url)
driver.maximize_window()
time.sleep(3)  # give the JS time to render the table

table = BeautifulSoup(driver.page_source, 'lxml')
df = pd.read_html(str(table))[1]  # the ranking table is the second <table> on the page
print(df.iloc[1:, 0:9])
Output:
Rank IVs CP Lvl % Atk Def Sta Prod
1 2072 10/10/10 1483 20 94.97 114.70 111.12 150 1911.8
2 1 0/12/15 1499 21 100.00 111.41 115.09 157 2013.1
3 2 0/13/14 1500 21 99.89 111.41 115.70 156 2010.9
4 3 0/13/13 1497 21 99.89 111.41 115.70 156 2010.9
5 4 0/14/12 1498 21 99.78 111.41 116.31 155 2008.6
6 5 1/14/10 1500 21 99.68 112.02 116.31 154 2006.6
7 6 0/15/11 1499 21 99.65 111.41 116.92 154 2006.1
8 7 0/15/10 1496 21 99.65 111.41 116.92 154 2006.1
9 8 1/15/8 1498 21 99.55 112.02 116.92 153 2004
10 9 3/15/15 1499 20.5 99.53 111.89 115.52 155 2003.5
11 10 1/10/15 1499 21 99.48 112.02 113.86 157 2002.6
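The indexing with [1] works because pd.read_html returns one DataFrame per <table> element it finds in the markup, as a list. A small self-contained illustration (with a made-up two-table HTML string, not this site's markup):

```python
from io import StringIO

import pandas as pd

# Two tables in one document: read_html yields a list of two DataFrames.
html = """
<table><tr><th>A</th></tr><tr><td>1</td></tr></table>
<table>
  <tr><th>Rank</th><th>CP</th></tr>
  <tr><td>1</td><td>1483</td></tr>
</table>
"""

tables = pd.read_html(StringIO(html))  # parses every <table> in the markup
df = tables[1]                         # pick the second table, as in the answer
print(len(tables), list(df.columns))   # → 2 ['Rank', 'CP']
```

If the page layout changes, the target table may shift position in the list, so it can be safer to select by content (e.g. pd.read_html(..., match='Rank')) than by index.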