Missing HTML Elements when scraping website. Python

Question

I am trying to extract an href from a website using bs4 and Selenium. However, when I parse the HTML with Beautiful Soup, the elements I'm looking for are missing, and when I search for them later I just get NoneType objects. Here is what I'd like to extract:

I am using the following code to parse quickly:

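from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# browser is the existing Selenium WebDriver session
my_url = browser.current_url
uClient = uReq(my_url)  # downloads the raw HTML again; no JavaScript is run
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")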

But when I run:

squeeps = page_soup.findAll("div",{'id':'pcisBody'})
squeeps[0]

This is all I get:

<div id="pcisBody">
<img alt="loading" height="40" src="/OnlineServices/Images/loading.gif" width="40"/>
<span id="pcisLoading">Retrieving Data...</span>
</div>

Any help would be greatly appreciated!! Here is the link: https://www.ladbsservices2.lacity.org/OnlineServices/PermitReport/PermitResults/444952

Answer 1

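BeautifulSoup only parses the HTML it is given. It does not run JavaScript, so it never sees content that a page loads after its initial response. The code in the question downloads the page a second time, outside the browser, which is why you only get the "Retrieving Data..." placeholder. As a workaround, use Selenium to visit the page, wait either a fixed amount of time or until a specific element has loaded, then take the browser's page source and pass that to BeautifulSoup.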
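
The workaround above could look roughly like the sketch below. It is not tested against the site in the question and rests on a few assumptions: that Chrome and a matching driver are installed, that the span with id "pcisLoading" disappears once the real data has arrived, and that the links you want sit inside the "pcisBody" div. The helper names fetch_rendered_html and extract_hrefs are made up for this example.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url, timeout=30):
    """Load url in a real browser and return the HTML after scripts have run."""
    # Selenium is imported here so extract_hrefs() also works without it installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are available
    try:
        driver.get(url)
        # Assumption: the "Retrieving Data..." span (id="pcisLoading") goes away
        # once the page has filled in the real content.
        WebDriverWait(driver, timeout).until(
            EC.invisibility_of_element_located((By.ID, "pcisLoading"))
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_hrefs(html, container_id="pcisBody"):
    """Return the href of every link inside the div with the given id."""
    page_soup = BeautifulSoup(html, "html.parser")
    container = page_soup.find("div", id=container_id)
    if container is None:
        return []
    return [a["href"] for a in container.find_all("a", href=True)]
```

If you already have a Selenium session open, as in the question, you probably do not need a second browser: run the same wait on your existing browser object and then parse browser.page_source instead of downloading the URL again.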
