Web Scraping the Registration Reset Website

Question

I am trying to get some perspective on web scraping this website. Essentially, what I am going to do is use the header keys as a way to scrape the data from the website and create a list of tuples, which I will convert into a data frame.

The issue is navigating between pages of results with a for loop (for example, moving from the first 50 results to the next 50 results).

What attribute, class, etc. would I need to access so that I can iterate from tab to tab until the maximum number of rows is reached?

https://www6.sos.state.oh.us/ords/f?p=119:REGRESET:0 :

Answer 1

The classes shown in the browser's Inspect Element view sometimes differ from the classes in the HTML the server actually returns, because the inspector shows the page after JavaScript has modified it. Try saving the raw page to a file, like this:

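import requests

# Fetch the raw HTML as the server returns it, before any JavaScript runs
response = requests.get("https://www6.sos.state.oh.us/ords/f?p=119:REGRESET:0")
# Save response.text; str(response) would only write "<Response [200]>"
with open("file.html", "w", encoding="utf-8") as f:
    f.write(response.text)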

Open the file in a browser and inspect it there. The classes you see will be the ones that are actually present in the HTML your scraper receives.
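
Once you know the real markup, the "header keys to list of tuples" idea from the question could look roughly like the sketch below. It is only an illustration under assumptions: the SAMPLE table, its column names, and the fetch_page(offset) callable are hypothetical, since I have not confirmed how this site lays out its table or how it exposes the next 50 results. The loop simply keeps requesting pages until one comes back with fewer rows than the page size.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect header cells (<th>) and data rows (<td>) from an HTML table."""

    def __init__(self):
        super().__init__()
        self.headers = []        # text of each <th> cell
        self.rows = []           # one tuple per <tr> that contains <td> cells
        self._row = None         # cells of the row currently being read
        self._cell = None        # text fragments of the cell currently being read
        self._is_header = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td"):
            self._cell = []
            self._is_header = tag == "th"

    def handle_data(self, data):
        # Only keep text that appears inside a <th> or <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            text = "".join(self._cell).strip()
            if self._is_header:
                self.headers.append(text)
            elif self._row is not None:
                self._row.append(text)
            self._cell = None
        elif tag == "tr":
            if self._row:  # skip rows that held only header cells
                self.rows.append(tuple(self._row))
            self._row = None


def parse_table(html):
    """Return (headers, rows) for the table cells found in html."""
    parser = TableParser()
    parser.feed(html)
    return parser.headers, parser.rows


def collect_pages(fetch_page, page_size=50, max_rows=1000):
    """Call fetch_page(offset) until a short page arrives or max_rows is hit.

    fetch_page is a hypothetical callable that returns the HTML for the
    results starting at the given offset; how to build that request for
    the real site is not something this sketch knows.
    """
    headers, all_rows = [], []
    offset = 0
    while len(all_rows) < max_rows:
        page_headers, rows = parse_table(fetch_page(offset))
        headers = headers or page_headers
        all_rows.extend(rows)
        if len(rows) < page_size:  # fewer rows than a full page: last page
            break
        offset += page_size
    return headers, all_rows[:max_rows]


# Hypothetical sample markup standing in for one page of results.
SAMPLE = """
<table>
  <tr><th>County</th><th>Name</th></tr>
  <tr><td>Franklin</td><td>A. Example</td></tr>
  <tr><td>Summit</td><td>B. Example</td></tr>
</table>
"""

headers, rows = parse_table(SAMPLE)
print(headers)
print(rows)
```

If pandas is installed, pandas.DataFrame(rows, columns=headers) turns the result into a data frame. For the real site you would still need to work out what request the "next" control sends and wrap that in fetch_page.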
