
Problem with web scraping using Python, BeautifulSoup, and pandas 'read_html'

Thanks in advance to anyone who can help!

I am scraping a table of COVID-19 data and loading it into a pandas DataFrame. It was working until this morning.

Here is the code:

import pandas as pd
import requests
from bs4 import BeautifulSoup


url = 'https://www.worldometers.info/coronavirus/'

req = requests.get(url)

page = BeautifulSoup(req.content, 'html.parser')

table = page.find_all('table',id="main_table_countries_today")[0]

print(table)

df = pd.read_html(str(table))[0]

This morning I started getting the following error:

ValueError: No tables found matching pattern '.+'

Can you please help me figure it out?

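Try changing the last line to:

df = pd.read_html(str(table), displayed_only=False)[0]

The table at that URL has changed its style attribute to style="width:100%;margin-top: 0px !important;display:none;". Previously the style did not set the 'display' property. By default, read_html uses displayed_only=True and skips elements styled with display:none, so the hidden table was no longer found.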
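The effect of displayed_only can be reproduced offline. The following is a minimal sketch, not the live page: HIDDEN_TABLE is a made-up stand-in for the scraped table, parse_table is a hypothetical helper, and flavor="lxml" assumes the lxml package is installed. The HTML is wrapped in StringIO because newer pandas versions deprecate passing a literal HTML string.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the scraped table; note display:none in the style.
HIDDEN_TABLE = """
<table id="main_table_countries_today" style="width:100%;display:none;">
  <thead><tr><th>Country</th><th>Cases</th></tr></thead>
  <tbody>
    <tr><td>Aland</td><td>10</td></tr>
    <tr><td>Bland</td><td>20</td></tr>
  </tbody>
</table>
"""


def parse_table(html, displayed_only):
    """Return the first table found in html as a DataFrame."""
    # flavor="lxml" is an assumption made here to keep the parser fixed.
    return pd.read_html(
        StringIO(html), flavor="lxml", displayed_only=displayed_only
    )[0]


# With the default (displayed_only=True) the hidden table is skipped,
# so pandas reports that no tables were found.
try:
    parse_table(HIDDEN_TABLE, displayed_only=True)
    default_failed = False
except ValueError as err:
    default_failed = True
    print("default failed:", err)

# With displayed_only=False the hidden table is parsed as usual.
df = parse_table(HIDDEN_TABLE, displayed_only=False)
print(df)
```

If the site later removes the display:none style, both calls should succeed, so passing displayed_only=False is harmless in either case.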


 