How to scrape links and images using Selenium

Question

I want to get the links and images of these characters.

Website link

https://opensea.io/collection/meebits?search[sortAscending]=false&search[sortBy]=FAVORITE_COUNT

XPath selection code

coll_name = driver.find_elements(By.XPATH, '//h1')
coll_desc = driver.find_elements(By.XPATH, '//div[@class="sc-1xf18x6-0 sc-1aqfqq9-0 sc-1y1ib3i-7 haVRLx dfsEJr eGsklH"]')
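profiles = driver.find_elements(By.XPATH, '//div[@role="grid"]/div')  # find_elements returns a list that can be looped over
for profile in profiles:
    # Search inside the current profile: call profile.find_elements and start the XPath with './/'
    art_name = profile.find_elements(By.XPATH, './/div[@class="sc-7qr9y8-0 sc-dw611d-1 iUvoJs fcpvjL"]')
    art_price = profile.find_elements(By.XPATH, './/div[@class="sc-7qr9y8-0 iUvoJs Price--amount"]')
    # By.LINK_TEXT matches a link's visible text, so it cannot take an XPath; './/a' is an assumed replacement
    art_link = profile.find_elements(By.XPATH, './/a')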

For-loop code

for c in coll_name:
    collection_name.append(c.text)
    time.sleep(1)


for d in coll_desc:
    collection_desc.append(d.text)
    time.sleep(1)

for n in art_name:
    artname.append(n.text)
    time.sleep(1)

for p in art_price:
    price.append(p.text)
    time.sleep(1)
    
for l in art_link:
    link.append(l.get_attribute('href'))  # use the loop variable l, and read the URL rather than the text
    time.sleep(1)

Please help me solve this problem.

Answer 1

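The following code scrolls the page, collects the items on it, scrolls again, and so on. The Selenium setup is for Linux; you only need to pay attention to the imports and to the code after the driver/browser is defined: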

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import time as t
import pandas as pd

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")


webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)

df_list = []
url = 'https://opensea.io/collection/meebits?search%5BsortAscending%5D=false&search%5BsortBy%5D=FAVORITE_COUNT'
browser.get(url)
while True:
    try:
        items = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@role='grid']/div[@role='gridcell']")))
        for item in items:
            print(item.text)
            print('______________')
    except Exception as e:
        continue
    browser.execute_script("window.scrollTo(0,document.body.scrollHeight);")
    t.sleep(5)
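
The loop above has no exit condition. One possible way to end it is to compare the page height before and after each scroll. This is a minimal sketch under the assumption that document.body.scrollHeight stops growing once the site has nothing more to load, which has not been verified against OpenSea. The names scroll_until_stable, collect, pause and max_rounds are hypothetical and not part of the original answer.

```python
import time


def scroll_until_stable(browser, collect, pause=5.0, max_rounds=100):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `collect` is a callable that receives the browser and reads whatever is
    currently rendered. Returns the number of scroll rounds performed.
    """
    last_height = browser.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:  # hard cap so the loop always terminates
        rounds += 1
        collect(browser)
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded items time to appear
        new_height = browser.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # the last scroll did not load anything new
        last_height = new_height
    return rounds
```

Here collect could wrap the WebDriverWait call and the inner for loop from the code above. The max_rounds cap is a safety net in case the height never settles.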

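A few notes:

Pay attention to the imports, and to how the code waits for the elements to load on the page before trying to locate them.
Work out the specific details you need from each item (note: some items have incomplete details, so wrap each detail lookup in try/except), build a tuple with all of an item's details, and append it to the df_list initialized at the start of the code. Once you have all the items, you can convert that list of tuples into a dataframe.
As written, the code is an infinite loop; you need to add a condition that breaks out of the loop once there are no more elements to load.
If anything above is unclear, revisit the basics of Python and Selenium 4.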
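
The per-item step described in the notes might look roughly like the sketch below. It assumes that the first a element inside a grid cell carries the item link and the first img element carries the item image; neither assumption has been checked against OpenSea's markup. The helper names first_attribute, cell_to_row and unique_rows are hypothetical. The locator string "css selector" is the value of By.CSS_SELECTOR in Selenium 4.

```python
def first_attribute(cell, css_selector, attribute):
    """Return an attribute of the first matching descendant, or None."""
    try:
        # "css selector" is the string behind By.CSS_SELECTOR.
        element = cell.find_element("css selector", css_selector)
        return element.get_attribute(attribute)
    except Exception:
        # Deliberately broad, like the try/except in the answer: a missing
        # link or image should leave a gap instead of aborting the scrape.
        return None


def cell_to_row(cell):
    """Turn one grid cell into a (text, link, image) tuple."""
    return (
        cell.text,
        first_attribute(cell, "a", "href"),   # assumed: first <a> is the item link
        first_attribute(cell, "img", "src"),  # assumed: first <img> is the item image
    )


def unique_rows(rows):
    """Drop repeated tuples while keeping the order of first appearance."""
    return list(dict.fromkeys(rows))
```

Inside the answer's for loop, df_list.append(cell_to_row(item)) would collect one tuple per cell. Because each scroll reads the visible cells again, the same item can show up more than once, so unique_rows(df_list) may be worth applying before building the dataframe with pd.DataFrame(df_list, columns=["text", "link", "image"]).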

Solution 1
0 2022-07-30 14:34:28
