I'm running a script that works well to scrape some data I need. The script crawls the existing URLs on a given web page and visits each one to capture the final (redirected) URL. The problem occurs when the final URL cannot be reached ("This site can't be reached"): the code crashes and I get this in the log:
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
from unknown error: cannot determine loading status
from tab crashed
(Session info: chrome=84.0.4147.135)
(Driver info: chromedriver=2.43.600210 (68dcf5eebde37173d4027fa8635e332711d2874a),platform=Windows NT 6.1.7601 SP1 x86_64)
Here is the code I use to scrape the final URLs:
#Open link (opens in new tab)
elem = driver.find_element_by_xpath('//*[@id="popup__teaser"]/div[6]/div/div/a')
elem.click()
time.sleep(2)
#Wait for the redirect to load, switch to the new tab, then grab the new URL
driver.switch_to.window(driver.window_handles[1])
time.sleep(1)
URL = driver.current_url
#Close active tab
driver.close()
#Switch back to the main tab
driver.switch_to.window(driver.window_handles[0])
Can anybody help with this issue? It only happens when the redirected URL cannot be reached. Thanks.
EDIT: I've tried adding chrome_options.add_argument('--disable-dev-shm-usage'), but it didn't work.
Try importing requests and checking the site's status code before visiting it. An active site normally responds with status code 200; if it doesn't, chances are it cannot be reached:
import requests

if requests.get(url).status_code != 200:
    continue  # skip this URL - it is unreachable or returned an error
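One caveat with the snippet above: when a site is completely unreachable (DNS failure, timeout, connection refused), `requests.get` raises an exception rather than returning a non-200 status, so the check itself would crash. A minimal sketch of a safer pre-check, assuming you call it before handing each URL to Selenium (`safe_get_status` and `should_visit` are illustrative names, not part of the original script):

```python
import requests
from requests.exceptions import RequestException

def safe_get_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the site
    can't be reached at all (DNS failure, timeout, refused connection)."""
    try:
        return requests.get(url, timeout=timeout).status_code
    except RequestException:
        return None

def should_visit(status_code):
    """Only visit URLs that actually responded with HTTP 200.
    None (unreachable) and error codes are both rejected."""
    return status_code == 200
```

In the crawl loop you would then do something like `if should_visit(safe_get_status(url)): ...` and only let Selenium click through for URLs that pass, so the driver never lands on a "This site can't be reached" page.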