Problem scraping the web with Python, BeautifulSoup, and pandas 'read_html'

Question

Thanks in advance for your help!

I am scraping a table of COVID-19 data and loading it into a pandas DataFrame. It worked until this morning.

Here is the code:

import pandas as pd
import requests
from bs4 import BeautifulSoup


url = 'https://www.worldometers.info/coronavirus/'

req = requests.get(url)

page = BeautifulSoup(req.content, 'html.parser')

table = page.find_all('table',id="main_table_countries_today")[0]

print(table)

df = pd.read_html(str(table))[0]

This morning I started getting the following error:

ValueError: No tables found matching pattern '.+'

Could you help me figure this out?

Answer 1

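Try changing the last line to: df = pd.read_html(str(table), displayed_only=False)[0]

The table header at the URL has changed its style attribute to style="width:100%;margin-top: 0px !important;display:none;". Previously it did not set the "display" property. By default, read_html only parses elements that are displayed, which is why it reports that no tables were found.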
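To see the effect without hitting the live site, here is a minimal offline sketch. The HTML snippet, the `read_table` helper, and the sample rows are made up for illustration. It assumes the `display:none` ends up on the `<table>` element itself, which would be consistent with the "No tables found" error, and it pins `flavor="lxml"` (so lxml must be installed) only to keep the failure case from falling through to another parser.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the scraped table (not the real page content):
# the <table> element carries display:none in its style attribute.
HIDDEN_TABLE_HTML = """
<table id="main_table_countries_today"
       style="width:100%;margin-top: 0px !important;display:none;">
  <thead><tr><th>Country</th><th>TotalCases</th></tr></thead>
  <tbody>
    <tr><td>A</td><td>10</td></tr>
    <tr><td>B</td><td>20</td></tr>
  </tbody>
</table>
"""


def read_table(html, displayed_only):
    """Parse the first table in `html` (illustrative helper, not from the post)."""
    # StringIO avoids the deprecation of passing literal HTML strings
    # in recent pandas versions. flavor="lxml" is an assumption of this sketch.
    tables = pd.read_html(StringIO(html), flavor="lxml",
                          displayed_only=displayed_only)
    return tables[0]


# Default behaviour (displayed_only=True): the hidden table is filtered out,
# so pandas raises ValueError because no tables remain.
try:
    read_table(HIDDEN_TABLE_HTML, displayed_only=True)
    default_failed = False
except ValueError as err:
    default_failed = True
    print("Default parse failed:", err)

# With displayed_only=False the hidden table is parsed as usual.
df = read_table(HIDDEN_TABLE_HTML, displayed_only=False)
print(df)
```

In the original script the same flag goes on the existing call, as the answer suggests; nothing else in the scraping code needs to change.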

Solution 1
3 (accepted) 2020-05-29 14:47:13