while loop issues in Python + Selenium

Can you tell me why my while loop isn't working, please? I get no error message; it just runs once.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd
import time

PATH = "/Users/csongordoma/Documents/chromedriver"
driver = webdriver.Chrome(PATH)
current_page = 1
driver.get('https://ingatlan.com/lista/elado+lakas+budapest?page=' + str(current_page))
data = {}
df = pd.DataFrame(columns=['Price', 'Address', 'Size', 'Rooms', 'URL', 'Labels'])

listings = driver.find_elements_by_css_selector('div.listing__card')

while current_page < 10:
    for listing in listings:
        data['Price'] = listing.find_elements_by_css_selector('div.price')[0].text
        data['Address'] = listing.find_elements_by_css_selector('div.listing__address')[0].text
        data['Size'] = listing.find_elements_by_css_selector('div.listing__parameters')[0].text
        data['Labels'] = listing.find_elements_by_css_selector('div.listing__labels')[0].text
        data['URL'] = listing.find_elements_by_css_selector('a.listing__link.js-listing-active-area')[0].get_attribute('href')
        df = df.append(data, ignore_index=True)
        current_page += 1

print(len(listings))
print(df)

#   driver.find_element_by_xpath("//a[. = 'Következő oldal']").click()

driver.quit()

The output is a good data frame of 20 items, which is one page's worth on the website I'm trying to scrape. I set the limit at 10 cycles so as not to overload anyone, but ideally I want to run through all pages.

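Move the page-loading and scraping code inside your while loop, and dedent the current_page += 1 line so that it belongs to the while loop rather than the inner for loop. In your version the counter increases once per listing, so after the 20 listings on the first page it is already 21 and the while current_page < 10 condition fails; driver.get is also called only once, before the loop, so no other page is ever loaded. I also added a try/except in case of errors, and a WebDriverWait so that the elements are reliably present after driver.get.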

current_page = 1

data = {}
df = pd.DataFrame(columns=['Price', 'Address', 'Size', 'Rooms', 'URL', 'Labels'])

while current_page < 10:
    # Load each page inside the loop so every pass scrapes a new page.
    driver.get('https://ingatlan.com/lista/elado+lakas+budapest?page=' + str(current_page))
    try:
        # Wait up to 10 seconds for the listing cards to be present.
        listings = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.listing__card")))
        for listing in listings:
            data['Price'] = listing.find_elements_by_css_selector('div.price')[0].text
            data['Address'] = listing.find_elements_by_css_selector('div.listing__address')[0].text
            data['Size'] = listing.find_elements_by_css_selector('div.listing__parameters')[0].text
            data['Labels'] = listing.find_elements_by_css_selector('div.listing__labels')[0].text
            data['URL'] = listing.find_elements_by_css_selector('a.listing__link.js-listing-active-area')[0].get_attribute('href')
            df = df.append(data, ignore_index=True)
    except Exception as exc:
        # Report which page failed instead of silently swallowing the error.
        print('Error on page', current_page, exc)
    # Increment once per page, at the while-loop level.
    current_page += 1

print(len(listings))
print(df)
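
To see why the placement of the increment matters, here is a minimal simulation without Selenium. The function name count_passes and its parameters are made up for illustration; the 20 listings per page and the limit of 10 come from the question.

```python
def count_passes(listings_per_page, increment_inside_for, limit=10):
    """Count how many times the while body runs for each placement of the increment."""
    current_page = 1
    passes = 0
    while current_page < limit:
        passes += 1
        for _ in range(listings_per_page):
            if increment_inside_for:
                current_page += 1  # original placement: once per listing
        if not increment_inside_for:
            current_page += 1      # corrected placement: once per page
    return passes


print(count_passes(20, increment_inside_for=True))   # 1
print(count_passes(20, increment_inside_for=False))  # 9
```

Note that starting at 1 with the condition current_page < 10 visits pages 1 through 9, so use <= 10 if ten pages are wanted.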

Imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
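
For the "run through all pages" part of the question, one possible approach is to keep requesting pages until one comes back empty or fails. The sketch below is only an illustration: scrape_all_pages, fetch_page and fake_fetch are hypothetical names, and it assumes (unverified for this site) that a page past the last one shows no listing cards. In real use, fetch_page would wrap driver.get and the WebDriverWait lookup from the loop above, and WebDriverWait raises a TimeoutException when no cards appear in time, which is why an error is treated here as the end of the results.

```python
def scrape_all_pages(fetch_page, max_pages=None):
    """Collect rows page by page until a page is empty, fails, or max_pages is reached.

    fetch_page is a hypothetical callable: page number -> list of row dicts.
    """
    rows = []
    page = 1
    while max_pages is None or page <= max_pages:
        try:
            page_rows = fetch_page(page)
        except Exception as exc:
            # Assumption: a failure (e.g. a wait timeout) means there are no more pages.
            print(f"Stopping at page {page}: {exc}")
            break
        if not page_rows:
            break
        # Copy each row so later changes to a reused dict cannot alter stored rows.
        rows.extend(dict(row) for row in page_rows)
        page += 1
    return rows


def fake_fetch(page):
    # Stand-in for the Selenium calls: three pages with two listings each.
    if page > 3:
        return []
    return [{"Price": f"{page}{i} M Ft", "Page": page} for i in range(2)]


rows = scrape_all_pages(fake_fetch)
print(len(rows))  # 6
```

The collected list of dicts can be passed to pd.DataFrame(rows) once at the end, which also avoids DataFrame.append, a method that was removed in pandas 2.0. Keeping a max_pages cap and a short pause between requests is a reasonable way to avoid overloading the site.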

Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you reproduce this content, please cite this site's URL or the original address.