A problem with web scraping using Python, BeautifulSoup and pandas 'read_html'

Question

Thanks in advance to anyone who helps!

I am scraping a table of COVID-19 data and loading it into a pandas DataFrame. It was working until this morning.

This is the code:

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.worldometers.info/coronavirus/'

# Download the page and parse the HTML
req = requests.get(url)
page = BeautifulSoup(req.content, 'html.parser')

# Select the first table with the expected id
table = page.find_all('table', id="main_table_countries_today")[0]
print(table)

# Convert the table's HTML into a DataFrame
df = pd.read_html(str(table))[0]

This morning I started getting the following error:

ValueError: No tables found matching pattern '.+'

Can you please help me figure it out?

Answer 1

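Try changing the last line to:

df = pd.read_html(str(table), displayed_only=False)[0]

The table header at the URL has changed its style attribute to style="width:100%;margin-top: 0px !important;display:none;". Previously it did not have the 'display' property set. By default, read_html uses displayed_only=True and skips elements styled with display: none, so it no longer finds the table.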
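To see the effect of displayed_only without depending on the live site, here is a minimal sketch that parses a small inline table. The HTML, the id "demo", the sample rows, and the helper name parse_table are all made up for illustration, and the example assumes lxml is installed (hence flavor="lxml"). The HTML is wrapped in StringIO because recent pandas versions warn when a literal HTML string is passed to read_html.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: a table hidden with display:none.
HTML = """
<table id="demo" style="width:100%;display:none;">
  <thead><tr><th>Country</th><th>Cases</th></tr></thead>
  <tbody>
    <tr><td>A</td><td>10</td></tr>
    <tr><td>B</td><td>20</td></tr>
  </tbody>
</table>
"""


def parse_table(html, displayed_only=True):
    """Return the first table as a DataFrame, or None if read_html finds none."""
    try:
        tables = pd.read_html(
            StringIO(html),
            flavor="lxml",  # assumption: lxml is available
            displayed_only=displayed_only,
        )
    except ValueError:
        # read_html raises ValueError when no table matches
        return None
    return tables[0]


# With the default (displayed_only=True) the hidden table is skipped.
print(parse_table(HTML))

# With displayed_only=False the hidden table is parsed as usual.
print(parse_table(HTML, displayed_only=False))
```

If the first call returns None while the second returns a DataFrame, the table is most likely being filtered out because of its display: none style rather than being absent from the HTML.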

1 answer

solution1
3 ACCEPTED 2020-05-29 14:47:13
