
Selenium Not Locating ALL elements with specific class name

I'm creating a web crawler for Zillow in order to practice using Selenium. All I'm trying to do is get the price, address, and link to each home, but when I use find_elements_by_class_name() or find_elements_by_css_selector(), it only finds the first 9 elements, even though there are many more on the page.

Normally my Selenium code works fine. Does anyone know why this occurs?

from selenium import webdriver
import time

zillow_url = "https://www.zillow.com/manhattan-new-york-ny/houses/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22usersSearchTerm%22%3A%22Manhattan%2C%20New%20York%2C%20NY%22%2C%22mapBounds%22%3A%7B%22west%22%3A-74.21047920019531%2C%22east%22%3A-73.73669379980468%2C%22south%22%3A40.626191262639644%2C%22north%22%3A40.933477919520115%7D%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A12530%2C%22regionType%22%3A17%7D%5D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22beds%22%3A%7B%22min%22%3A0%2C%22max%22%3A0%7D%2C%22price%22%3A%7B%22max%22%3A400000%7D%2C%22mp%22%3A%7B%22max%22%3A1300%7D%7D%2C%22isListVisible%22%3Atrue%2C%22mapZoom%22%3A11%7D"

address = "My chrome driver address" 
driver = webdriver.Chrome(executable_path=address)
driver.get(zillow_url)

time.sleep(2)

prices = driver.find_elements_by_class_name("list-card-price")
addresses = driver.find_elements_by_class_name("list-card-addr")
links = driver.find_elements_by_class_name("list-card-link")

Try this.

from selenium import webdriver
import time

zillow_url = "https://www.zillow.com/manhattan-new-york-ny/houses/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22usersSearchTerm%22%3A%22Manhattan%2C%20New%20York%2C%20NY%22%2C%22mapBounds%22%3A%7B%22west%22%3A-74.21047920019531%2C%22east%22%3A-73.73669379980468%2C%22south%22%3A40.626191262639644%2C%22north%22%3A40.933477919520115%7D%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A12530%2C%22regionType%22%3A17%7D%5D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22beds%22%3A%7B%22min%22%3A0%2C%22max%22%3A0%7D%2C%22price%22%3A%7B%22max%22%3A400000%7D%2C%22mp%22%3A%7B%22max%22%3A1300%7D%7D%2C%22isListVisible%22%3Atrue%2C%22mapZoom%22%3A11%7D"

address = "My chrome driver address" 
driver = webdriver.Chrome(executable_path=address)
driver.get(zillow_url)
prices = []
addresses = []
links = []
time.sleep(2)


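SCROLL_PAUSE_TIME = 0.5
NUM_HOUSES = 40  # placeholder value: the number of houses you want to scrape

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

# Keep scrolling until enough listings have been collected
while len(prices) <= NUM_HOUSES: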
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    prices = driver.find_elements_by_class_name("list-card-price")
    addresses = driver.find_elements_by_class_name("list-card-addr")
    links = driver.find_elements_by_class_name("list-card-link")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

For the loop condition, use len(prices) <= N, where N is the number of houses you want to scrape.
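
The scroll loop above can also be wrapped in a small helper. This is only a sketch, not a tested Zillow scraper: it assumes, as the answer does, that scrolling the window makes the page load more cards. The function name scroll_and_collect and the max_rounds safety cap are my own additions, and "class name" is simply the string value of Selenium's By.CLASS_NAME locator.

```python
import time


def scroll_and_collect(driver, class_name, target, pause=0.5, max_rounds=50):
    """Scroll to the bottom until at least `target` elements with `class_name`
    are found, the page stops growing, or `max_rounds` scrolls were made.

    `driver` is expected to behave like a Selenium WebDriver, i.e. provide
    execute_script() and find_elements(by, value).
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    # "class name" is the string behind selenium's By.CLASS_NAME
    elements = driver.find_elements("class name", class_name)
    rounds = 0
    while len(elements) < target and rounds < max_rounds:
        # Scroll down to the bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded cards time to render
        elements = driver.find_elements("class name", class_name)
        # Stop if the page height did not change, i.e. nothing new was loaded
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height
        rounds += 1
    return elements
```

With a real driver it could be called as prices = scroll_and_collect(driver, "list-card-price", target=40), where 40 is an arbitrary example. The price and address text can then be read from each element's .text, and the link from get_attribute("href").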
