I have a very large list of URLs that I'm trying to scrape, and I'm iterating over it with a for loop.
Eventually, at some element x of the list, my Chrome window crashes (an 'Aw, Snap!' error appears in the browser window). I have no idea how to fix this issue.
I can't share my code, but it's something like this:
very_large_url_list = [url1, url2, url3, url4, ...]

for x in very_large_url_list:
    driver.get(x)
    doStuff()
If I try to close the driver on every iteration, like this:
for x in very_large_url_list:
    driver.get(x)
    doStuff()
    driver.close()
I get an error stating that the session ID is invalid. If I don't close it, then a memory leak eventually occurs and I won't be able to finish iterating over the list. What can I do to fix this?
Please let me know if I haven't been clear enough so I can edit the question!
If you close the driver on every iteration, shouldn't you also be creating a new one on every iteration, like this?
for x in very_large_url_list:
    driver = webdriver.Chrome()
    driver.get(x)
    doStuff()
    driver.quit()  # quit(), not close(): ends the whole session and the chromedriver process
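Starting a brand-new Chrome instance for every single URL works, but it is slow. A middle ground is to restart the driver every N URLs, which caps memory growth while amortizing the startup cost. Below is a minimal sketch: the batching helper is plain Python, and the Selenium usage in the comment assumes the selenium package is installed and that doStuff() is your own scraping routine.

```python
from itertools import islice

def batches(items, size):
    """Yield successive lists of up to `size` items from `items`."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Usage sketch (assumes selenium is installed; batch size 100 is arbitrary):
#
# from selenium import webdriver
#
# for chunk in batches(very_large_url_list, 100):
#     driver = webdriver.Chrome()   # fresh browser per batch caps memory growth
#     try:
#         for url in chunk:
#             driver.get(url)
#             doStuff()
#     finally:
#         driver.quit()             # always tear down, even if a URL crashes the tab
```

Wrapping the per-batch work in try/finally means a single crashed tab only costs you one batch, not the whole run.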
Did you know that we can also load a URL without using the get() or navigate() methods? This is a frequently asked interview question as well, so let's learn it.
First, open the browser's JavaScript console and type:

window.location = 'https://www.redbus.in'

then hit the Enter key. You will notice that the redbus website loads. This is a way of loading a URL without using methods like get() or navigate(). The statement above is a JavaScript command; we will cover JavaScript concepts later.
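The same JavaScript command can be sent from Selenium itself through execute_script(). Here is a minimal sketch: the helper that builds the snippet is plain Python, and the commented usage assumes the selenium package and a working ChromeDriver (the redbus URL is just the example from above).

```python
def location_js(url):
    """Build the JavaScript snippet that navigates the current window to `url`."""
    return f"window.location = '{url}'"

# Usage sketch (assumes selenium is installed and ChromeDriver is on PATH):
#
# from selenium import webdriver
#
# driver = webdriver.Chrome()
# driver.execute_script(location_js("https://www.redbus.in"))  # loads the page without get()
# driver.quit()
```

Note that, unlike driver.get(), execute_script() returns immediately and does not wait for the new page to finish loading, so you may need an explicit wait afterwards.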