Selenium goes into infinite loop in Python

I am trying to scrape a website and am fairly new to Python. I have managed to come up with the code below. The problem, however, is that it goes into an infinite loop after reaching the last page, i.e. when the Next button is greyed out. Also, I don't think I am catching the stale element exception properly here. Any help would be greatly appreciated!

pages_remaining = True

while pages_remaining:
    button=driver.find_element_by_class_name("arrow-right")
    href_data = button.get_attribute('href')
    if href_data is not None:
        soup=BeautifulSoup(driver.page_source,"html.parser")
        data = soup.find_all("div",{"class":"shelfProductStamp-content row"})
        count = 1
        for item in data:
            ProductText=item.find("a",attrs={"class":"shelfProductStamp-imageLink"})["title"]    
            if item.find("span",attrs={"class":"sf-pricedisplay"}) is not None:
                Price=item.find("span",attrs={"class":"sf-pricedisplay"}).text
            else:
                Price=""
            if item.find("p",attrs={"class":"sf-comparativeText"}) is not None:
                SubPrice1=item.find("p",attrs={"class":"sf-comparativeText"}).text
            else:
                SubPrice1=""
            if item.find("span",attrs={"class":"sf-regoption"}) is not None:
                Option=item.find("span",attrs={"class":"sf-regoption"}).text
            else:
                Option=""           
            SubPrice=str(SubPrice1)+"-"+str(Option)
            SaleDates=item.find("div",attrs={"class":"sale-dates"}).text
            urll2=driver.current_url
            PageNo=driver.find_element_by_class_name("current").text
            writer.writerow([ProductText,Price,SubPrice,SaleDates,PageNo])
            count+=1
    try:
        def find(driver):
            element = driver.find_element_by_class_name("arrow-right")
            if element:
                return element
            else:
                pages_remaining=False
                #driver.quit()
        time.sleep(10)
        driver.implicitly_wait(10)
        element = WebDriverWait(driver, 60).until(find)
        driver.execute_script("arguments[0].click();", element)
    except StaleElementReferenceException:
        pass
    else:
        break

Thanks

When you set pages_remaining = False inside the find() function, that assignment creates a local variable. It is not the same variable as pages_remaining in the outer loop, so the loop condition never changes.

If you want to do it that way, you'll need to make it a global, for example by declaring global pages_remaining at the top of find().
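
To make the scoping point concrete, here is a minimal sketch in plain Python with no Selenium involved. The function names find_without_global and find_with_global are made up for illustration, and it assumes the flag lives at module level, as it appears to in the question.

```python
pages_remaining = True


def find_without_global():
    # Assigning here creates a new local name; the module-level flag is untouched.
    pages_remaining = False


def find_with_global():
    # The global statement makes the assignment target the module-level flag.
    global pages_remaining
    pages_remaining = False


find_without_global()
flag_after_local = pages_remaining   # the outer flag has not changed

find_with_global()
flag_after_global = pages_remaining  # the outer flag has now been updated

print(flag_after_local, flag_after_global)
```

If the while loop itself sat inside another function, nonlocal would be the keyword to use instead of global.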

Thanks for your help here. I managed to fix this by simply adding another if statement at the end and moving the time.sleep(10) call to after the click, as below:

try:
    def find(driver):
        element = driver.find_element_by_class_name("arrow-right")
        if element:
            return element
    driver.implicitly_wait(10)
    element = WebDriverWait(driver, 60).until(find)
    # Click through JavaScript, then give the next page time to load.
    driver.execute_script("arguments[0].click();", element)
    time.sleep(10)
except StaleElementReferenceException:
    pass
# href_data was read at the top of this iteration; None means the last page.
if href_data is None:
    break
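
The control flow of that fix can be sketched without a browser. The version below is only an illustration under stated assumptions: scrape_all_pages, FakePager and the max_pages guard are hypothetical names, not part of Selenium or of the original code. In real use the three callables would wrap the BeautifulSoup parsing, the get_attribute('href') check and the click.

```python
def scrape_all_pages(scrape_page, next_href, go_next, max_pages=1000):
    """Scrape each page, stopping once the Next link has no href."""
    pages_done = 0
    while pages_done < max_pages:   # hard cap so a bad selector cannot loop forever
        scrape_page()
        pages_done += 1
        if next_href() is None:     # Next button greyed out: this was the last page
            break
        go_next()
    return pages_done


class FakePager:
    """Hypothetical stand-in for the browser; Next is disabled on the last page."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.current = 1
        self.scraped = []

    def scrape_page(self):
        self.scraped.append(self.current)

    def next_href(self):
        if self.current < self.total_pages:
            return "?page=%d" % (self.current + 1)
        return None

    def go_next(self):
        self.current += 1


pager = FakePager(total_pages=3)
pages_scraped = scrape_all_pages(pager.scrape_page, pager.next_href, pager.go_next)
print(pages_scraped, pager.scraped)
```

Unlike the code in the question, this sketch scrapes the page before checking the href, so the last page is also collected. That is a design choice rather than something the original post states.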
