Python Selenium: running into StaleElementReferenceException

Question

I am trying to scrape all job listings posted on Glassdoor in the past 24 hours and save them to a dictionary.

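from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.firefox import GeckoDriverManager

binary = FirefoxBinary('path_to_firefox_binary.exe')
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = True
driver = webdriver.Firefox(firefox_binary=binary, capabilities=cap, executable_path=GeckoDriverManager().install())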

base_url = 'https://www.glassdoor.com/Job/jobs.htm?suggestCount=0&suggestChosen=false&clickSource=searchBtn' \
       '&typedKeyword=data+sc&sc.keyword=data+scientist&locT=C&locId=1154532&jobType= '
driver.get(url=base_url)
driver.implicitly_wait(20)
driver.maximize_window()
WebDriverWait(driver, 20).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, "div#filter_fromAge>span"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((
        By.XPATH, "//div[@id='PrimaryDropdown']/ul//li//span[@class='label' and contains(., 'Last Day')]"))).click()

# find job listing elements on web page
listings = driver.find_elements_by_class_name("jl")
n_listings = len(listings)

results = {}

for index in range(n_listings):
    driver.find_elements_by_class_name("jl")[index].click()  # runs into error
    print("clicked listing {}".format(index + 1))
    info = driver.find_element_by_class_name("empInfo.newDetails")
    emp = info.find_element_by_class_name("employerName")

    # (the extraction of title, emp_name and description is not shown in the post)
    results[index] = {'title': title, 'company': emp_name, 'description': description}

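I keep getting the error message

selenium.common.exceptions.StaleElementReferenceException: Message: The element reference is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed

on the first line of my for loop. Even when the loop runs for a number of iterations, it eventually raises the exception. I am new to Selenium and web scraping and would appreciate any help.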

Answer 1

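Every time a new listing is selected, the clicked element is modified, so the DOM is refreshed. That change is slow compared with the actions in your loop, so you need to slow the loop down a little. Instead of using a fixed sleep, you can wait for the change to happen.

Each time you select a listing, a new class, selected, is added to it and its style attribute loses its content. You should wait for that to happen, read the information, and then click the next listing.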

wait = WebDriverWait(driver, 20)
# iterate over every listing; the check further down skips the click after the last one
for index in range(n_listings):
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.selected:not([style="border-bottom:0"])')))
    print("clicked listing {}".format(index + 1))
    info = driver.find_element_by_class_name('empInfo.newDetails')
    emp = info.find_element_by_class_name('employerName')
    if index < n_listings - 1:
        driver.find_element_by_css_selector('.selected + .jl').click()
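
To make the difference between a fixed sleep and a condition-based wait concrete, here is a minimal, Selenium-free sketch of the polling idea that WebDriverWait(...).until(...) relies on. The names wait_until and FakeListing are hypothetical and exist only for this illustration. The real WebDriverWait differs in details: for example, it passes the driver to the condition and raises Selenium's own TimeoutException.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Hypothetical helper, not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(poll)


class FakeListing:
    """Stand-in for a listing that gains the 'selected' class after a delay."""

    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    @property
    def classes(self):
        if time.monotonic() >= self._ready_at:
            return ["jl", "selected"]
        return ["jl"]


listing = FakeListing(ready_after=0.2)
# Returns as soon as the class appears instead of always sleeping a fixed time.
wait_until(lambda: "selected" in listing.classes, timeout=2.0, poll=0.05)
```

A fixed sleep always costs the full delay and still fails if the page is slower than expected. A polling wait returns as soon as the condition holds and only fails after the timeout.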

Answer 2

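This error means the element you are trying to click is no longer present in the DOM. Either make sure the target element exists before calling click(), or wrap the call in a try/except block.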

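import time

from selenium.common.exceptions import StaleElementReferenceException

# ...
results = {}

for index in range(n_listings):
    try:
        # re-locate the listings on every iteration, then click the one at `index`
        driver.find_elements_by_class_name("jl")[index].click()
    except StaleElementReferenceException:
        # note: `continue` moves on to the next index, so this listing is skipped
        print('Listing went stale, skipping it after 1 second ...')
        time.sleep(1)
        continue
    print("clicked listing {}".format(index + 1))
    info = driver.find_element_by_class_name("empInfo.newDetails")
    emp = info.find_element_by_class_name("employerName")
# ...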
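
The loop above moves on to the next listing when the click fails. If you would rather retry the same listing, one possible shape is a small retry helper. This is only a sketch: click_with_retry and StaleReference are hypothetical names, and StaleReference stands in for Selenium's StaleElementReferenceException so that the example runs without a browser.

```python
import time


class StaleReference(Exception):
    """Stand-in for selenium's StaleElementReferenceException."""


def click_with_retry(find_and_click, retry_on=(StaleReference,), attempts=3, delay=1.0):
    """Run find_and_click() until it succeeds or the attempts run out.

    find_and_click should re-locate the element and click it on every call.
    Returns the number of the attempt that succeeded; re-raises the last
    error if every attempt fails. Hypothetical helper, not part of Selenium.
    """
    for attempt in range(1, attempts + 1):
        try:
            find_and_click()
            return attempt
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay)


# Simulated action that fails twice with a stale reference, then succeeds.
calls = {"n": 0}


def flaky_click():
    calls["n"] += 1
    if calls["n"] < 3:
        raise StaleReference("element went stale")


print(click_with_retry(flaky_click, delay=0.01))  # prints 3
```

With Selenium you would pass retry_on=(StaleElementReferenceException,) and a callable that looks the element up again on each call, for example one wrapping driver.find_elements_by_class_name("jl")[index].click(). Re-locating inside the callable matters, because retrying a click on the same stale reference would fail again.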

2 answers

Answer 1
2 accepted 2020-11-03 23:29:12

Answer 2
1 2020-11-02 17:18:24