Why can't BeautifulSoup find a table on this webpage?

I'm trying to scrape a table from this website using Google Colab, but when I run the code below I get empty brackets.

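import urllib.request as url
from bs4 import BeautifulSoup

page = 'https://www.stadiumgaming.gg/rank-checker?pokemon=Walrein'
# Fetch the HTML exactly as the server sends it (no JavaScript is executed)
html = url.urlopen(page)
# Collect every <td> cell found in that HTML
soup = BeautifulSoup(html, 'html5lib').find_all('td')
print(soup)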

Output: []

How can I find the table on this page so that I can parse it into a DataFrame?

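BeautifulSoup can't find the table on this page because the table is populated dynamically by JavaScript. bs4 only parses the HTML it is given; it does not execute JavaScript. You can instead let Selenium render the page in a real browser and then pass the rendered source to bs4 and pandas.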
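To make the difference concrete, here is a minimal sketch that runs the same `td` lookup on two small HTML strings. Both strings are hypothetical stand-ins written for illustration (including the `ranks` id); they are not the real markup of the page. The first imitates what a plain HTTP request receives before any script runs, the second imitates the page source after a browser has executed the script.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the raw response: the table body is empty
# and a script is expected to fill it in after the page loads.
static_html = """
<table id="ranks"><tbody></tbody></table>
<script>/* fills #ranks after load */</script>
"""

# Hypothetical stand-in for the page source after the script has run.
rendered_html = """
<table id="ranks"><tbody>
<tr><td>1</td><td>0/12/15</td></tr>
</tbody></table>
"""


def cell_texts(html):
    """Return the text of every <td> cell in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.find_all("td")]


static_cells = cell_texts(static_html)
rendered_cells = cell_texts(rendered_html)
print(static_cells)    # []
print(rendered_cells)  # ['1', '0/12/15']
```

The parser is the same in both calls; only the input differs. That is why swapping `html5lib` for another parser does not help here, while fetching the rendered source with a browser does, as in the Selenium code that follows.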

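from io import StringIO
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

webdriver_service = Service("./chromedriver")  # path to your chromedriver
driver = webdriver.Chrome(service=webdriver_service)
url = 'https://www.stadiumgaming.gg/rank-checker?pokemon=WALREIN'
driver.get(url)
driver.maximize_window()
time.sleep(3)  # give the page's JavaScript time to fill in the table

# Parse the rendered page source rather than the raw HTTP response
table = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()

# read_html returns a list of every table it finds; index 1 selects the second one.
# StringIO avoids the deprecation warning newer pandas raises for literal HTML strings.
df = pd.read_html(StringIO(str(table)))[1]
# Show rows from index 1 onward and the first nine columns
print(df.iloc[1:, 0:9])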

Result:

    Rank       IVs    CP   Lvl       %     Atk     Def  Sta    Prod
1   2072  10/10/10  1483    20   94.97  114.70  111.12  150  1911.8
2      1   0/12/15  1499    21  100.00  111.41  115.09  157  2013.1
3      2   0/13/14  1500    21   99.89  111.41  115.70  156  2010.9
4      3   0/13/13  1497    21   99.89  111.41  115.70  156  2010.9
5      4   0/14/12  1498    21   99.78  111.41  116.31  155  2008.6
6      5   1/14/10  1500    21   99.68  112.02  116.31  154  2006.6
7      6   0/15/11  1499    21   99.65  111.41  116.92  154  2006.1
8      7   0/15/10  1496    21   99.65  111.41  116.92  154  2006.1
9      8    1/15/8  1498    21   99.55  112.02  116.92  153    2004
10     9   3/15/15  1499  20.5   99.53  111.89  115.52  155  2003.5
11    10   1/10/15  1499    21   99.48  112.02  113.86  157  2002.6
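Since the question mentions Google Colab, which has no display, a headless variant with an explicit wait instead of the fixed `time.sleep(3)` may be more robust. The following is only a sketch under several assumptions: Chrome is installed in the environment, Selenium is version 4.6 or later (so it can locate a matching driver on its own), and the presence of any `table td` element is a good enough signal that the table has been filled in. The function names `rank_checker_url` and `fetch_rendered_html` are hypothetical helpers, not part of any library.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.stadiumgaming.gg/rank-checker"


def rank_checker_url(pokemon):
    """Build the rank-checker URL for one Pokemon name."""
    return f"{BASE_URL}?{urlencode({'pokemon': pokemon})}"


def fetch_rendered_html(url, timeout=15):
    """Load url in headless Chrome and return the HTML once a table cell exists."""
    # Selenium is imported here so the URL helper above works without it.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # no display is available in Colab
    options.add_argument("--no-sandbox")    # often needed when running as root
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Assumption: at least one <td> inside a <table> means the data has loaded.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "table td"))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

A call such as `fetch_rendered_html(rank_checker_url("Walrein"))` would return a string that can be passed to `pd.read_html` (wrapped in `io.StringIO`) in the same way as `driver.page_source` above. The explicit wait returns as soon as the cells appear and raises a timeout error if they never do, rather than always pausing for a fixed three seconds.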
