How to loop multiple elements in python selenium (different CSS selectors)

I am trying to loop inside a class offer-list-wrapper which contains multiple elements. Almost all of the elements are common to the web pages for search A and search B (I am writing a scraper).

As you can see in both images, offer-list-wrapper is a common element.

I want to extract the data inside every element with the classes organic-offer-wrapper organic-gallery-offer-inner or organic-list-offer-inner m-gallery-product-item-v2. That is very easy to do if you loop over them with a CSS selector like this:

for element in driver.find_elements_by_css_selector('.organic-list-offer-inner.m-gallery-product-item-v2'):

In that way you can get every element inside them.

BUT the issue starts here: I need to handle both cases with ONE generic piece of code that loops inside both classes, and if a new class appears it has to loop inside that one too.

Let me show you my code:

for element in driver.find_elements_by_class_name('offer-list-wrapper'):
    try:
        item_name = element.find_element_by_class_name('organic-gallery-title__content').text
    except:
        item_name = np.nan
    try:
        price = element.find_element_by_class_name('gallery-offer-price').get_attribute('title').replace('$', '').replace(',', '')
        min_order = element.find_element_by_class_name('gallery-offer-minorder').find_element_by_tag_name('span').text.replace(' Pieces', '').replace(' Piece', '').replace(' Units', '').replace(' Unit', '').replace(' Sets', '').replace(' Set', '').replace(' Pairs', '').replace(' Pair', '').replace('Boxes', '').replace('Box', '').replace('Bags', '').replace('Bag', '')     
        # separate min and max price
    except:
        price = np.nan
        min_order = np.nan

This first one returns only the first element. Here is my second attempt:

for element in driver.find_elements_by_css_selector('.organic-offer-wrapper.organic-gallery-offer-inner'):
    try:
        item_name = element.find_element_by_class_name('organic-gallery-title__content').text
    except:
        item_name = np.nan
    try:
        price = element.find_element_by_class_name('gallery-offer-price').get_attribute('title').replace('$', '').replace(',', '')
        min_order = element.find_element_by_class_name('gallery-offer-minorder').find_element_by_tag_name('span').text.replace(' Pieces', '').replace(' Piece', '').replace(' Units', '').replace(' Unit', '').replace(' Sets', '').replace(' Set', '').replace(' Pairs', '').replace(' Pair', '').replace('Boxes', '').replace('Box', '').replace('Bags', '').replace('Bag', '')     
        # separate min and max price
    except:
        price = np.nan
        min_order = np.nan

This second one loops only inside .organic-offer-wrapper.organic-gallery-offer-inner (returning all the elements that I need), but it doesn't loop inside .organic-list-offer-inner.m-gallery-product-item-v2.

You can get all the products by searching for the div tags that have the attribute data-content="productItem". That assumes each item has that attribute; from the screenshots you posted, that seems to be the case.

You can accomplish this using find_elements_by_xpath():

for item in driver.find_elements_by_xpath('//div[@data-content="productItem"]'):
    ....
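
On current Selenium releases the find_elements_by_* helpers no longer exist (they were removed in Selenium 4.3), so the same loop needs the generic find_elements(by, value) call. Below is a minimal sketch, assuming a driver that is already on the results page. find_product_items is a hypothetical helper name, and the literal "xpath" is the value of By.XPATH:

```python
from typing import Any, List

# By.XPATH in selenium.webdriver.common.by is the plain string "xpath";
# the literal is used here so the sketch needs no Selenium import.
XPATH = "xpath"

# XPath from the answer above: every <div> tagged as a product item.
PRODUCT_XPATH = '//div[@data-content="productItem"]'


def find_product_items(driver: Any) -> List[Any]:
    """Return every product <div>, whatever CSS classes it carries.

    `driver` is assumed to be a Selenium WebDriver (or any object with a
    compatible find_elements(by, value) method).
    """
    return driver.find_elements(XPATH, PRODUCT_XPATH)
```

Usage would look like `for item in find_product_items(driver): ...`, with the per-item extraction from the question inside the loop. The CSS attribute selector div[data-content="productItem"], passed with "css selector", should match the same elements.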

This is probably the best approach, because you don't have to worry about the elements having different CSS classes.
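
If some layout turns out not to carry data-content="productItem", one fallback is a CSS selector list: selectors separated by commas match an element that satisfies any of them, so both class combinations from the question can be covered in a single loop. The wrapper loop in the question most likely produced one row because the page has a single offer-list-wrapper, and find_element_by_class_name returns only the first matching descendant. The sketch below is an illustration under those assumptions, not the answer's method. ITEM_CLASS_SETS, build_group_selector, first_text and scrape_names are hypothetical names, "css selector" is the value of By.CSS_SELECTOR, and missing fields come back as None rather than np.nan:

```python
from typing import Any, Iterable, List, Optional, Sequence

CSS = "css selector"  # the string value of By.CSS_SELECTOR

# Class combinations seen in the question; add a tuple when a new layout appears.
ITEM_CLASS_SETS = [
    ("organic-offer-wrapper", "organic-gallery-offer-inner"),
    ("organic-list-offer-inner", "m-gallery-product-item-v2"),
]


def build_group_selector(class_sets: Iterable[Sequence[str]]) -> str:
    """Join class combinations into one CSS selector list, e.g. '.a.b, .c.d'."""
    return ", ".join(
        "".join("." + name for name in names) for names in class_sets
    )


def first_text(element: Any, selectors: Iterable[str]) -> Optional[str]:
    """Return the text of the first child selector that matches, else None."""
    for selector in selectors:
        # find_elements returns an empty list instead of raising when
        # nothing matches, so no try/except is needed here.
        matches = element.find_elements(CSS, selector)
        if matches:
            return matches[0].text
    return None


def scrape_names(driver: Any) -> List[Optional[str]]:
    """Collect the title of every item card, one entry per card."""
    selector = build_group_selector(ITEM_CLASS_SETS)
    return [
        first_text(item, [".organic-gallery-title__content"])
        for item in driver.find_elements(CSS, selector)
    ]
```

Because each card is its own loop element, the child lookups run once per product instead of once per wrapper. Passing several selectors to first_text lets one field use different class names in the gallery and list layouts.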

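Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.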