
Selenium (Python) raises StaleElementReferenceException and stops before visiting every link returned by webdriver.find_elements_by_partial_link_text()

I am using the Selenium bindings for Python to download every link on a page whose link text contains the string "VS". The problem is that the second item in the list points to a page that does not exist (it returns a 404 error). If I click the broken link manually, the browser shows:

error.html - 404 error page does not exist.

Running the following code raises an error:

import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import StaleElementReferenceException, NoSuchElementException, NoSuchWindowException

# To prevent download dialog
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2)  # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/msword, application/vnd.ms-powerpoint')

driver = webdriver.Firefox(profile)
driver.get("http://www.SOME_URL.com/")

links = driver.find_elements_by_partial_link_text("VS")

for link in links:
    url = link.get_attribute("href")
    try:
        driver.get(url)
    except StaleElementReferenceException:
        pass

The error:

Traceback (most recent call last):
  File "C:\Users\lskrinjar\Dropbox\work\preracun\src\web_data_mining\get_files_from_web.py", line 79, in <module>
    url = link.get_attribute("href")
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\webelement.py", line 93, in get_attribute
    resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\webelement.py", line 385, in _execute
    return self._parent.execute(command, params)
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "C:\Python27\lib\site-packages\selenium-2.44.0-py2.7.egg\selenium\webdriver\remote\errorhandler.py", line 166, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element not found in the cache - perhaps the page has changed since it was looked up
Stacktrace:
    at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:8329:1)
    at Utils.getElementAt (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:7922:10)
    at WebElement.getElementAttribute (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:11107:31)
    at DelayedCommand.prototype.executeInternal_/h (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:11635:16)
    at fxdriver.Timer.prototype.setTimeout/<.notify (file:///c:/users/lskrin~1/appdata/local/temp/tmppsn7tm/extensions/fxdriver@googlecode.com/components/command-processor.js:548:5)

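The links list holds references to elements on the page that was loaded when find_elements_by_partial_link_text() was called. As soon as driver.get() navigates away from that page, those elements no longer exist, so every remaining entry in the list points to a stale element. The next call to link.get_attribute("href") then raises StaleElementReferenceException. As the traceback shows, the exception comes from the get_attribute line, which sits outside your try block, so it is never caught. Instead of iterating over the elements, collect the URLs as strings first and iterate over those: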

# Read every href while the original page is still loaded.
links = driver.find_elements_by_partial_link_text("VS")
list_of_links = [link.get_attribute("href") for link in links]

# Plain strings stay valid after navigation, so nothing can go stale here
# and the try/except around driver.get() is no longer needed.
for string_link in list_of_links:
    driver.get(string_link)
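
The same idea can be wrapped in a small helper. This is only a sketch: collect_hrefs is a hypothetical name, not part of Selenium, and it assumes nothing about its arguments beyond a get_attribute("href") method, which Selenium's WebElement provides. It also skips links without an href and drops duplicate URLs, which the original loop does not do, so treat that as an optional extra.

```python
def collect_hrefs(elements):
    """Return the unique, non-empty href values of `elements`, in page order.

    `elements` is any iterable of objects exposing get_attribute(name),
    such as the WebElement list returned by a find_elements call.
    Call this BEFORE navigating away, while the elements are still valid.
    """
    seen = set()
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        # Skip anchors without an href and URLs that were already collected.
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs


# Intended use with a live driver (not executed here):
#
#   links = driver.find_elements_by_partial_link_text("VS")
#   for url in collect_hrefs(links):
#       driver.get(url)
```

In Selenium 4 the find_elements_by_* helpers were removed; the equivalent lookup there is driver.find_elements(By.PARTIAL_LINK_TEXT, "VS") with By imported from selenium.webdriver.common.by.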

