Unable to fetch all the necessary links during iteration - Selenium Python

Question

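I am new to Selenium Python. I am trying to fetch profile URLs (10 per page). Without the while loop, I can fetch all 10 URLs, but only for the first page. When I use while, it iterates through the pages, but fetches only 3 or 4 URLs per page.

I need to fetch all 10 links and keep paging through the results. I think I have to do something about StaleElementReferenceException.

Please help me solve this problem.

The code is given below.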

def test_connect_fetch_profiles(self):
    driver = self.driver
    search_data = driver.find_element_by_id("main-search-box")
    search_data.clear()
    search_data.send_keys("Selenium Python")
    search_submit = driver.find_element_by_name("search")
    search_submit.click()
    noprofile = driver.find_elements_by_xpath("//*[text() = 'Sorry, no results containing all your search terms were found.']")
    self.assertFalse(noprofile)
    while True:
        wait = WebDriverWait(driver, 150)
        try:
            profile_links = wait.until(EC.presence_of_all_elements_located((By.XPATH,"//*[contains(@href,'www.linkedin.com/profile/view?id=')][text()='LinkedIn Member'or contains(@href,'Type=NAME_SEARCH')][contains(@class,'main-headline')]")))
            for each_link in profile_links:
                page_links = each_link.get_attribute('href')
                print(page_links)
                driver.implicitly_wait(15)
                appendFile = open("C:\\Users\\jayaramb\\Documents\\profile-links.csv", 'a')
                appendFile.write(page_links + "\n")
                appendFile.close()
                driver.implicitly_wait(15)
                next = wait.until(EC.visibility_of(driver.find_element_by_partial_link_text("Next")))
                if next.is_displayed():
                    next.click()
                else:
                    print("End of Page")
                    break
        except ValueError:
            print("It seems no values to fetch")
        except NoSuchElementException:
            print("No Elements to Fetch")
        except StaleElementReferenceException:
            print("No Change in Element Location")
        else:
            break

Please let me know if there is any other effective way to fetch the required profile URLs and keep paging through the results.

Answer 1

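I created a similar setup, and it works fine for me. Selenium had some trouble when it tried to click the Next button: it threw a WebDriverException, probably because the Next button was not visible. So instead of clicking the Next button, I read its href attribute and loaded the new page with driver.get(). This avoids the actual click and makes the test more stable.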

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By


def test_fetch_google_links():

    links = []

    # Setup driver
    driver = webdriver.Firefox()
    driver.implicitly_wait(10)
    driver.maximize_window()

    # Visit google
    driver.get("https://www.google.com")

    # Enter search query
    search_data = driver.find_element(By.NAME, "q")
    search_data.send_keys("test")

    # Submit search query
    search_button = driver.find_element(By.XPATH, "//button[@type='submit']")
    search_button.click()

    while True:
        # Find all anchors and read each href right away, before navigating
        anchors = driver.find_elements(By.XPATH, "//h3//a")
        links += [a.get_attribute("href") for a in anchors]

        try:
            # Find the next page button and load its target instead of clicking it
            next_button = driver.find_element(By.XPATH, "//a[@id='pnnext']")
            location = next_button.get_attribute("href")
            driver.get(location)

        except NoSuchElementException:
            # No next page button: this was the last page
            break

    # Do something with the links
    for link in links:
        print(link)

    print("Found {} links".format(len(links)))

    driver.quit()
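
In the question's code, the click on Next sits inside the for loop, so the page starts changing while the remaining elements are still being read. That is most likely why only 3 or 4 URLs come back per page and why StaleElementReferenceException shows up. Below is a minimal sketch of the same idea as a reusable function, not a definitive implementation. The names collect_links, link_xpath, next_xpath and max_pages are made up for illustration. It assumes the Next control is an anchor with an href, and it passes the locator strategy as the plain string "xpath", which is the value of By.XPATH.

```python
def collect_links(driver, link_xpath, next_xpath, max_pages=50):
    """Collect href strings from consecutive result pages (hypothetical helper)."""
    links = []
    for _ in range(max_pages):
        # Copy the hrefs into plain strings right away. Strings cannot go
        # stale, unlike element references, which die when the page changes.
        anchors = driver.find_elements("xpath", link_xpath)  # "xpath" == By.XPATH
        links.extend(a.get_attribute("href") for a in anchors)

        # Navigate only after the current page has been read completely.
        # find_elements returns an empty list when nothing matches, so no
        # exception handling is needed to detect the last page.
        next_buttons = driver.find_elements("xpath", next_xpath)
        if not next_buttons:
            break
        next_url = next_buttons[0].get_attribute("href")
        if not next_url:
            break
        driver.get(next_url)
    return links
```

With the XPaths from the example above, a call would look like collect_links(driver, "//h3//a", "//a[@id='pnnext']"). The max_pages cap is only a safety net against paging forever. The returned list can then be written to the CSV file in one go, rather than reopening the file for every link.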

Answer 1: 0, accepted, 2016-03-25 20:07:30