I am trying to scrape a web page to collect a list of Fortune 500 companies. However, when I run this code, BeautifulSoup can't find the <div class="rt-tr-group" role="rowgroup"> tags.
import requests
from bs4 import BeautifulSoup

url = r'https://fortune.com/fortune500/2019/search/'
# Download the raw HTML as the server sends it (no JavaScript is executed)
page = requests.get(url)
soup = BeautifulSoup(page.content, 'lxml')
# Look for the table row groups
data = soup.find_all('div', {'class': 'rt-tr-group'})
Instead, I just get an empty list. I've tried changing the parser, but that made no difference.
The tags do exist when the page is viewed in a browser.
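The data on that page is loaded by JavaScript some time after the initial HTML arrives, so it is not in the response that requests receives. With Selenium you can wait until the page has finished loading, or you can try to get the data from the JavaScript side directly.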
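As a rough sketch of the Selenium route, something like the following could work. It assumes Chrome and a matching driver are installed, that the rows still use the rt-tr-group class from the question, and that 20 seconds is long enough; the function names fetch_rendered_html and find_row_groups are made up for this example. It uses the built-in html.parser, but 'lxml' should work the same way.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url, css_selector="div.rt-tr-group", timeout=20):
    """Open the page in a real browser and return the HTML after JS has run.

    Selenium is imported here so the parsing helper below can be used
    without a browser installed. Assumes Chrome plus a matching driver.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one matching element is present in the DOM,
        # or raise TimeoutException after `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def find_row_groups(html, parser="html.parser"):
    """Return the text of every <div class="rt-tr-group"> in the given HTML."""
    soup = BeautifulSoup(html, parser)
    rows = soup.find_all("div", {"class": "rt-tr-group"})
    return [row.get_text(" ", strip=True) for row in rows]
```

You would then call find_row_groups(fetch_rendered_html(url)) with the url from the question. The parsing step is unchanged from the original code; the only difference is that the HTML now comes from the browser after the scripts have run.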
P.S. You can also check the XHR requests the page makes and try to fetch the JSON directly, without parsing any HTML.
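A minimal sketch of the JSON route might look like this. The endpoint has to be found in the browser's network tab and is not given here, and the layout of the response is unknown, so the helper just collects every value stored under a given key. The names fetch_json and collect_values, and the "name" key in the usage note, are assumptions for illustration only.

```python
import requests


def fetch_json(url, timeout=10):
    """Request a JSON endpoint and return the decoded body.

    `url` must be the XHR address seen in the browser's network tab;
    no real endpoint is assumed here.
    """
    response = requests.get(
        url, headers={"Accept": "application/json"}, timeout=timeout
    )
    # Fail loudly on 4xx/5xx instead of trying to decode an error page.
    response.raise_for_status()
    return response.json()


def collect_values(node, key):
    """Recursively gather every value stored under `key` in nested JSON."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            # Keep descending: the key may also appear deeper in the tree.
            found.extend(collect_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_values(item, key))
    return found
```

For example, if the response turned out to hold company records with a "name" field, collect_values(fetch_json(xhr_url), "name") would return the names as a list. Once you know the real structure, indexing into it directly is simpler than searching recursively.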
The content of the page you are parsing is loaded with JavaScript, so requests.get can return a page that does not contain it.