I am working on a project in Python, and it is my first time using bs4 and requests. I have a page that loads, but all the information is added afterwards by JavaScript. Using bs4 and requests I can't seem to get the data added by JS. How would I do that?
import requests
from bs4 import BeautifulSoup

page = "https://covid-19-newfoundland-and-labrador-gnl.hub.arcgis.com"
result = requests.get(page)
source = result.text
soup = BeautifulSoup(source, 'html.parser')

if soup.head.parent.name == 'html':
    print(soup.title)
    tmpBody = soup.body
    # print(soup)
    div1 = soup.find(id="ember63")
    print(soup.find_all('section'))
    print(div1)
else:
    print("not html")
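To illustrate the problem: requests only fetches the raw HTML the server sends, so anything a script injects afterwards never exists in that text. A minimal sketch with made-up HTML (the element id matches the question, but the page content here is hypothetical):

```python
from bs4 import BeautifulSoup

# Hypothetical response body: the server ships an empty placeholder div,
# and a script fills it in later -- in the browser, never in Python.
raw_html = """
<html><head><title>Dashboard</title></head><body>
  <div id="ember63"></div>
  <script>document.getElementById('ember63').textContent = 'Total # of Cases: N';</script>
</body></html>
"""

soup = BeautifulSoup(raw_html, 'html.parser')
div = soup.find(id="ember63")
print(repr(div.get_text()))  # '' -- the script never ran, so the div is empty
```

This is why the first snippet prints an empty `ember63` div: bs4 parses the static markup and does not execute JavaScript.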
I found some code similar to this on Stack Overflow, but it says the chromedriver executable needs to be on PATH, and I am not sure what chromedriver is.
from bs4 import BeautifulSoup
from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=chrome_options)  # chrome_options= is deprecated; use options=
url = "https://covid-19-newfoundland-and-labrador-gnl.hub.arcgis.com"
driver.get(url)
page = driver.page_source
soup = BeautifulSoup(page, 'html.parser')

if soup.head.parent.name == 'html':
    print(soup.title)
    tmpBody = soup.body
    div1 = soup.find(id="ember63")
    print(soup.find_all('section'))
    print(div1)
else:
    print("not html")
Without knowing exactly what you want, I guess this is close:
from selenium import webdriver
from selenium.webdriver.common.by import By

chrome_options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=chrome_options)
url = "https://covid-19-newfoundland-and-labrador-gnl.hub.arcgis.com"
driver.get(url)
driver.implicitly_wait(15)  # wait up to 15 s for elements to appear

# find_element_by_css_selector was removed in Selenium 4.3; use find_element(By.CSS_SELECTOR, ...)
div1 = driver.find_element(By.CSS_SELECTOR, '#ember63').text.strip()
print(*[section.text.strip() for section in driver.find_elements(By.CSS_SELECTOR, 'section')])
print(div1)
driver.quit()  # quit() ends the whole session, not just the current window
prints:
Covid-19 Home Pandemic Update
Total # of Cases
......
New Cases
Last new case: July 26, 2020
Total Recovered
......
Currently Hospitalized
.......
Currently in ICU
......
Total Deaths
......
Active Cases
......
Total # of People Tested
......
And so on..
This also does it all in the same library, without needing BeautifulSoup. Why Selenium is not working for you is answered in the comments.