Why can't Beautifulsoup find a table on this webpage
I am trying to scrape a table from this website using Google Colab, but when I run the code below I get empty brackets.
import urllib.request as url
from bs4 import BeautifulSoup
page = 'https://www.stadiumgaming.gg/rank-checker?pokemon=Walrein'
html = url.urlopen(page)
soup = BeautifulSoup(html, 'html5lib').findAll('td')
print(soup)
Output: []
How can I find the table on this page so that I can parse it into a dataframe?
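The empty list means the <td> elements are not present in the HTML that urlopen downloaded. BeautifulSoup itself handles static markup fine; a minimal sketch (using a made-up HTML snippet, not this site's markup) showing that findAll('td') does return cells whenever they exist in the source:

```python
from bs4 import BeautifulSoup

# A static table: the <td> cells are present in the raw markup itself.
static_html = "<table><tr><td>Walrein</td><td>1500</td></tr></table>"

cells = BeautifulSoup(static_html, "html.parser").find_all("td")
print([c.text for c in cells])  # → ['Walrein', '1500']
```

So an empty result on the live page indicates the table is inserted into the DOM after page load, rather than a parsing problem.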
Beautifulsoup can't find a table on this webpage because it is populated dynamically by JavaScript, and bs4 can't execute JS. But you can mimic a real browser with selenium and then hand the rendered page to bs4 and pandas:
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service)
url = 'https://www.stadiumgaming.gg/rank-checker?pokemon=WALREIN'
driver.get(url)
driver.maximize_window()
time.sleep(3)  # give the JS time to render the table

table = BeautifulSoup(driver.page_source, 'lxml')
df = pd.read_html(str(table))[1]  # the ranking table is the second <table> on the page
print(df.iloc[1:, 0:9])
Output:
Rank IVs CP Lvl % Atk Def Sta Prod
1 2072 10/10/10 1483 20 94.97 114.70 111.12 150 1911.8
2 1 0/12/15 1499 21 100.00 111.41 115.09 157 2013.1
3 2 0/13/14 1500 21 99.89 111.41 115.70 156 2010.9
4 3 0/13/13 1497 21 99.89 111.41 115.70 156 2010.9
5 4 0/14/12 1498 21 99.78 111.41 116.31 155 2008.6
6 5 1/14/10 1500 21 99.68 112.02 116.31 154 2006.6
7 6 0/15/11 1499 21 99.65 111.41 116.92 154 2006.1
8 7 0/15/10 1496 21 99.65 111.41 116.92 154 2006.1
9 8 1/15/8 1498 21 99.55 112.02 116.92 153 2004
10 9 3/15/15 1499 20.5 99.53 111.89 115.52 155 2003.5
11 10 1/10/15 1499 21 99.48 112.02 113.86 157 2002.6
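The indexing with [1] works because pd.read_html returns one DataFrame per <table> element it finds in the markup, as a list. A small self-contained illustration (with a made-up two-table HTML string, not this site's markup):

```python
from io import StringIO

import pandas as pd

# Two tables in one document: read_html yields a list of two DataFrames.
html = """
<table><tr><th>A</th></tr><tr><td>1</td></tr></table>
<table>
  <tr><th>Rank</th><th>CP</th></tr>
  <tr><td>1</td><td>1483</td></tr>
</table>
"""

tables = pd.read_html(StringIO(html))  # parses every <table> in the markup
df = tables[1]                         # pick the second table, as in the answer
print(len(tables), list(df.columns))   # → 2 ['Rank', 'CP']
```

If the page layout changes, the target table may shift position in the list, so it can be safer to select by content (e.g. pd.read_html(..., match='Rank')) than by index.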