Can't select elements in selenium (Python)

I'm using Python for web scraping, but the page has to be scrolled to load all the content, so I use Selenium. I got the first part working: the web driver launches, accepts the cookies and scrolls x times (2 times in the code below, because I had to wait 5 minutes just to get a blank list T_T).

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

url = "https://www.aboutyou.es/c/hombre/zapatos-20215"
opt = webdriver.ChromeOptions()
opt.add_argument("start-maximized")

driver = webdriver.Chrome(options = opt) 
driver.get(url)


cookies = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '//button[@id="onetrust-accept-btn-handler"]'))
    ).click()

#element = driver.find_element(By.XPATH, '//button[@data.testid="loadMoreButton_100"]')

html = driver.find_element(By.TAG_NAME, 'html')
intentos = 2

try:
        mas = WebDriverWait(driver, 15).until(
        EC.presence_of_element_located((By.XPATH, '//button[@data-testid="loadMoreButton_100"]'))
        ).click()
except:
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)

for i in range(intentos):
    try:
         mas = WebDriverWait(driver, 1).until(
         EC.presence_of_element_located((By.XPATH, '//button[@data-testid="loadMoreButton_100"]'))
            ).click()
    except:
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
        html.send_keys(Keys.PAGE_DOWN)
        if i < intentos - 1: 
            continue

grid_grande = driver.find_elements(By.XPATH,'//a[class="sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr"]')
print(grid_grande)

The element I want to select is the grid that contains all the other data, but I only get a blank list []:

<a data-testid="productTile-4218512" style="--product-tile-contents-height:112px" class="sc-16ol3xi-0 sc-163x4qs-0 fybchu loqbdm sc-nlxe42-2 fwTCrr" href="/p/panama-jack/botas-con-cordones-4218512"><div data-testid="productImage" class="sc-mt3y39-0 iYaafh">
<img height="100%" width="100%" decoding="async" importance="auto" loading="lazy" sizes="(max-width: 767px) calc(100vw / 3), calc(100vw / 4)" srcset="https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&amp;quality=75&amp;trim=1&amp;height=160&amp;width=120 120w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&amp;quality=75&amp;trim=1&amp;height=480&amp;width=360 360w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&amp;quality=75&amp;trim=1&amp;height=534&amp;width=400 400w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&amp;quality=75&amp;trim=1&amp;height=800&amp;width=600 600w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&amp;quality=75&amp;trim=1&amp;height=1067&amp;width=800 800w, https://cdn.aboutstatic.com/file/8742f6c70de3baecb60acc24c5f5d3d7?brightness=0.96&amp;quality=75&amp;trim=1&amp;height=1280&amp;width=960 960w" style="border-radius:2px" alt="PANAMA JACK - Botas con cordones en marrón: frente" data-testid="productImageView" class="sc-1876d5f-0-Component giShmP">
<div class="sc-1i699m5-0 eHwLkT"><div data-testid="badge-GENERIC" class="sc-1dqvaay-1 cHhMjt">Más sostenible</div></div>
</div><button type="button" data-testid="wishListButton" class="sc-1yegbck-0 cFfJJS sc-122ag38-0 eHyXBK sc-1ytk4ze-1 jrOzwu sc-1cy39j4-0 eCBNan"><svg class="sc-vu2m91-0 cXGjqJ sc-1ytk4ze-0 ebHRsM" data-testid="WishListIcon"><use xlink:href="#/assets/media/ic-heart.e31e11e8.svg"></use></svg><div class="sc-122ag38-1 ixqHjB"></div></button><div class="sc-nlxe42-0 kRHZwU"><div class="sc-1qsfqrd-0 xHpAu"><p data-testid="brandName" class="sc-1vt6vwe-0 sc-1vt6vwe-2 sc-1qsfqrd-1 dmJKga cyVcre gtGpeQ">PANAMA JACK</p><div class="sc-18q4lz4-2 cySBlJ sc-1qsfqrd-6 khWqDb" data-testid="priceBox"><span data-testid="finalPrice" class="sc-2qclq4-0 sc-18q4lz4-0 ePNAqF fbtbBY">169,00 €</span></div><div class="sc-1qsfqrd-7 eUQMHN"><ul data-testid="ColorContainer" class="sc-1qsfqrd-3 eSoPTy">
<li data-testid="ColorBubble-simple-#663300" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS dhRoGM sc-1qsfqrd-8 dYSOSZ"></li><li data-testid="ColorBubble-simple-#000000" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS gmeSfI sc-1qsfqrd-8 dYSOSZ"></li><li data-testid="ColorBubble-simple-#663300" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS dhRoGM sc-1qsfqrd-8 dYSOSZ"></li><li data-testid="ColorBubble-simple-#4c2002" class="sc-kt3zrg-0 sc-kt3zrg-1 jEkiIS hFwoRv sc-1qsfqrd-8 dYSOSZ"></li><li class="sc-1qsfqrd-4 glNrlz">+<!-- -->2</li></ul><span data-testid="Sizes" class="sc-1qsfqrd-5 gZDHxk">Disponible en muchas tallas</span></div></div></div></a>
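# Reuses the `driver` created in the question; each wait allows up to 60 seconds
wait = WebDriverWait(driver, 60)
driver.get("https://www.aboutyou.es/c/hombre/zapatos-20215")
# Accept the cookie banner
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@id='onetrust-accept-btn-handler']"))).click()
# Click the svg icon inside the #modalContent dialog (presumably its close button)
wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='modalContent']/div[1]/*[name()='svg']"))).click()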

SCROLL_PAUSE_TIME = 3

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

# Wait until every product tile link is visible, then print its HTML
elems = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//a[starts-with(@data-testid,'productTile')]")))
for elem in elems:
    print(elem.get_attribute('outerHTML'))

I'm not sure what your expected output is, so I just grabbed all the a tags for those product tiles.
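
If the goal is structured data rather than raw HTML, one option is to parse each printed outerHTML string. The sketch below is only an illustration: it uses the standard library's html.parser and relies on the data-testid values visible in the question's markup (productTile, brandName, finalPrice). The names TileParser and parse_tile are made up for this example, and the sample string is a shortened version of the tile from the question.

```python
from html.parser import HTMLParser


class TileParser(HTMLParser):
    """Collects href, brand and price from one product tile's outerHTML."""

    # data-testid value -> key used in the result dict (hypothetical naming)
    WANTED = {"brandName": "brand", "finalPrice": "price"}

    def __init__(self):
        super().__init__()
        self.data = {}
        self._field = None  # field whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        testid = attrs.get("data-testid") or ""
        if tag == "a" and testid.startswith("productTile"):
            self.data["href"] = attrs.get("href")
        elif testid in self.WANTED:
            self._field = self.WANTED[testid]

    def handle_data(self, data):
        # Store the first non-empty text that follows a wanted tag
        if self._field and data.strip():
            self.data[self._field] = data.strip()
            self._field = None


def parse_tile(outer_html):
    parser = TileParser()
    parser.feed(outer_html)
    parser.close()
    return parser.data


# Shortened version of the tile shown in the question
sample = (
    '<a data-testid="productTile-4218512" href="/p/panama-jack/botas-con-cordones-4218512">'
    '<p data-testid="brandName" class="x">PANAMA JACK</p>'
    '<span data-testid="finalPrice" class="y">169,00 €</span></a>'
)
tile = parse_tile(sample)
print(tile)
```

With Selenium this could be applied as parse_tile(elem.get_attribute('outerHTML')) for each element in elems. Keying on data-testid rather than on the generated class names is a design choice here, on the assumption that the test ids change less often than the hashed classes.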

The key issue is waiting for your elements to become visible and then grabbing the data.
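
One more detail in the question's locator may be worth checking: it is written as //a[class="..."]. In XPath an attribute needs an @ prefix; without it, class is read as the name of a child element, so the predicate does not match an a tag whose class is an attribute. The sketch below illustrates the difference with the standard library's xml.etree.ElementTree, which supports a subset of XPath. The markup is a simplified stand-in for the real page, and the class value "tile" is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up stand-in for the product grid from the question
markup = """
<div>
  <a data-testid="productTile-4218512" class="tile" href="/p/panama-jack/botas-con-cordones-4218512">PANAMA JACK</a>
  <a data-testid="other" class="tile" href="/x">Other</a>
</div>
"""
root = ET.fromstring(markup)

# Without "@": looks for <a> elements that have a CHILD element named "class"
without_at = root.findall(".//a[class='tile']")

# With "@": compares the class ATTRIBUTE of each <a>
with_at = root.findall(".//a[@class='tile']")

# ElementTree has no starts-with(), so the prefix test is done in Python here;
# the browser XPath engine used by Selenium does support starts-with()
tiles = [a for a in root.iter("a")
         if a.get("data-testid", "").startswith("productTile")]

print(len(without_at), len(with_at), len(tiles))
```

Even with the @ added, an exact match on the full generated class string only works while those class names stay the same, which is one reason the data-testid locator used above may be the more robust choice.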

Imports:

import time  # needed for time.sleep in the scroll loop

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
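
The scroll loop above runs until the page height stops changing, so on a page that keeps loading it could run for a long time. A minimal sketch of the same loop wrapped in a function with an upper bound is shown below. The function name scroll_until_stable and the max_rounds parameter are assumptions added for illustration, not part of Selenium; only driver.execute_script is a Selenium call.

```python
import time


def scroll_until_stable(driver, pause=3.0, max_rounds=50):
    """Scroll to the bottom until the page height stops growing.

    Returns the number of scrolls performed. `max_rounds` is a safety cap
    (an assumption for this sketch) so the loop always terminates.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # Scroll down to the bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the page time to load more content
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds  # nothing new was loaded
        last_height = new_height
    return max_rounds
```

It would be called as scroll_until_stable(driver) in place of the while loop, before waiting for the product tiles.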
