Unable to fetch all the necessary links during iteration - Selenium Python

Question

I am a newbie to Selenium Python. I am trying to fetch the profile URLs, which come 10 per page. Without using while, I am able to fetch all 10 URLs, but only for the first page. When I use while, it iterates, but fetches only 3 or 4 URLs per page.

I need to fetch all 10 links and keep iterating through the pages. I think I must do something about StaleElementReferenceException.

Kindly help me solve this problem.

The code is given below.

def test_connect_fetch_profiles(self):
    driver = self.driver
    search_data = driver.find_element_by_id("main-search-box")
    search_data.clear()
    search_data.send_keys("Selenium Python")
    search_submit = driver.find_element_by_name("search")
    search_submit.click()
    noprofile = driver.find_elements_by_xpath("//*[text() = 'Sorry, no results containing all your search terms were found.']")
    self.assertFalse(noprofile)
    while True:
        wait = WebDriverWait(driver, 150)
        try:
            profile_links = wait.until(EC.presence_of_all_elements_located((By.XPATH,"//*[contains(@href,'www.linkedin.com/profile/view?id=')][text()='LinkedIn Member'or contains(@href,'Type=NAME_SEARCH')][contains(@class,'main-headline')]")))
            for each_link in profile_links:
                page_links = each_link.get_attribute('href')
                print(page_links)
                driver.implicitly_wait(15)
                appendFile = open("C:\\Users\\jayaramb\\Documents\\profile-links.csv", 'a')
                appendFile.write(page_links + "\n")
                appendFile.close()
                driver.implicitly_wait(15)
                next = wait.until(EC.visibility_of(driver.find_element_by_partial_link_text("Next")))
                if next.is_displayed():
                    next.click()
                else:
                    print("End of Page")
                    break
        except ValueError:
            print("It seems no values to fetch")
        except NoSuchElementException:
            print("No Elements to Fetch")
        except StaleElementReferenceException:
            print("No Change in Element Location")
        else:
            break

Please let me know if there are any other effective ways to fetch the required profile URLs and keep iterating through the pages.

Answer 1

I created a similar setup which works alright for me. I've had some problems with Selenium trying to click the next button and throwing a WebDriverException instead, likely because the next button is not in view. Hence, instead of clicking the next button, I get its href attribute and load the new page with driver.get(), which avoids an actual click and makes the test more stable.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

def test_fetch_google_links():

    links = []

    # Setup driver
    driver = webdriver.Firefox()
    driver.implicitly_wait(10)
    driver.maximize_window()

    # Visit google
    driver.get("https://www.google.com")

    # Enter search query
    search_data = driver.find_element_by_name("q")
    search_data.send_keys("test")

    # Submit search query
    search_button = driver.find_element_by_xpath("//button[@type='submit']")
    search_button.click()

    while True:
        # Find and collect all anchors
        anchors = driver.find_elements_by_xpath("//h3//a")
        links += [a.get_attribute("href") for a in anchors]

        try:
            # Find the next page button
            next_button = driver.find_element_by_xpath("//a[@id='pnnext']")
            location = next_button.get_attribute("href")
            driver.get(location)

        except NoSuchElementException:
            break

    # Do something with the links
    for link in links:
        print(link)

    print("Found {} links".format(len(links)))

    driver.quit()
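
Applied to the code in the question, the same idea might look like the sketch below. In the question's loop, the click on "Next" appears to sit inside the for loop, so the page changes after the first link is read and the remaining elements go stale. The sketch reads every href into a plain string first and only then moves on. The function name collect_profile_links and the max_pages safety cap are hypothetical, and the locator strings "xpath" and "partial link text" are the values behind By.XPATH and By.PARTIAL_LINK_TEXT, written out so the snippet carries no imports. Treat it as a rough outline under those assumptions, not a tested LinkedIn scraper.

```python
def collect_profile_links(driver, link_xpath, next_text="Next", max_pages=50):
    """Collect href strings page by page, following the "Next" link by URL.

    max_pages is a hypothetical safety cap so the loop always terminates.
    """
    links = []
    for _ in range(max_pages):
        # Read every href into a plain string BEFORE leaving the page,
        # so no element reference is used after navigation.
        anchors = driver.find_elements("xpath", link_xpath)
        links.extend(a.get_attribute("href") for a in anchors)

        # find_elements returns an empty list when nothing matches,
        # so no exception handling is needed to detect the last page.
        next_links = driver.find_elements("partial link text", next_text)
        if not next_links:
            break

        next_url = next_links[0].get_attribute("href")
        if not next_url:
            break

        # Load the next page by URL instead of clicking the link.
        driver.get(next_url)
    return links
```

The returned list can then be written to the CSV file in one go after the loop, rather than reopening the file for every link.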


Answer 1
0 Accepted 2016-03-25 20:07:30
