How to scrape links and images using Selenium
I want to get the links and images of these characters.
Website link:
https://opensea.io/collection/meebits?search[sortAscending]=false&search[sortBy]=FAVORITE_COUNT
XPATH selection code:
coll_name = driver.find_elements(By.XPATH, '//h1')
coll_desc = driver.find_elements(By.XPATH, '//div[@class="sc-1xf18x6-0 sc-1aqfqq9-0 sc-1y1ib3i-7 haVRLx dfsEJr eGsklH"]')
profiles = driver.find_element(By.XPATH, '//div[@role="grid"]/div')
for profile in profiles:
    art_name = driver.find_elements(By.XPATH, '/div[@class="sc-7qr9y8-0 sc-dw611d-1 iUvoJs fcpvjL"]')
    art_price = driver.find_elements(By.XPATH, '//div[@class="sc-7qr9y8-0 iUvoJs Price--amount"]')
    art_link = driver.find_elements(By.LINK_TEXT, '(//link/@href)[16]')
FOR loop code:
for c in coll_name:
    collection_name.append(c.text)
    time.sleep(1)
for d in coll_desc:
    collection_desc.append(d.text)
    time.sleep(1)
for n in art_name:
    artname.append(n.text)
    time.sleep(1)
for p in art_price:
    price.append(p.text)
    time.sleep(1)
for l in art_link:
    link.append(n.text)
    time.sleep(1)
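For reference, the code above has three separate problems: `profiles` is fetched with `find_element` (singular), which returns one element and is not iterable; the inner XPaths start with `/div...` or `//div...`, which search from the document root (or match nothing) instead of searching inside each `profile`; and the last loop appends `n.text` where `l` is the loop variable. A minimal corrected sketch, assuming the question's class names are still current on opensea.io (they change often, so verify them in DevTools first):

```python
# Sketch of the corrected loops. The class names are copied verbatim from
# the question and may be stale. "xpath" is the literal value of Selenium's
# By.XPATH, so this sketch needs no selenium import.

def scrape_profiles(driver):
    artname, price, link = [], [], []
    # find_elements (plural) -- find_element returns a single, non-iterable element
    profiles = driver.find_elements("xpath", "//div[@role='grid']/div")
    for profile in profiles:
        # ".//" searches relative to `profile`; a leading "/" or "//" would
        # search the whole document on every iteration
        names = profile.find_elements(
            "xpath", './/div[@class="sc-7qr9y8-0 sc-dw611d-1 iUvoJs fcpvjL"]')
        prices = profile.find_elements(
            "xpath", './/div[@class="sc-7qr9y8-0 iUvoJs Price--amount"]')
        anchors = profile.find_elements("xpath", ".//a")
        artname.extend(n.text for n in names)
        price.extend(p.text for p in prices)
        # was: link.append(n.text) -- the wrong variable's text
        link.extend(a.get_attribute("href") for a in anchors)
    return artname, price, link
```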
Please help me solve this problem.
The following code scrolls the page, collects the items on it, scrolls again, and so on. The Selenium setup is for Linux; just pay attention to the imports and to the code after the driver/browser is defined:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import time as t
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
df_list = []
url = 'https://opensea.io/collection/meebits?search%5BsortAscending%5D=false&search%5BsortBy%5D=FAVORITE_COUNT'
browser.get(url)
while True:
    try:
        items = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@role='grid']/div[@role='gridcell']")))
        for item in items:
            print(item.text)
            print('______________')
    except Exception as e:
        continue
    browser.execute_script("window.scrollTo(0,document.body.scrollHeight);")
    t.sleep(5)
A few notes:
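As written, `while True` never exits, and each item is only printed rather than scraped. A hedged sketch of a variant that stops once the scroll height stops growing and pulls the link and image out of each gridcell (the structural XPaths `.//a` and `.//img` are assumptions about OpenSea's markup, which changes often, so verify them first):

```python
import time as t

# Sketch: terminate when document.body.scrollHeight stops changing, and
# extract href/src per gridcell instead of printing item.text.
# "xpath" is the literal value of By.XPATH, so the import is not needed here.
def scrape_items(browser, pause=5):
    rows, seen = [], set()
    last_height = 0
    while True:
        items = browser.find_elements(
            "xpath", "//div[@role='grid']/div[@role='gridcell']")
        for item in items:
            try:
                link = item.find_element("xpath", ".//a").get_attribute("href")
                img = item.find_element("xpath", ".//img").get_attribute("src")
            except Exception:
                continue  # cell not fully rendered yet
            if link not in seen:  # cells repeat across scrolls
                seen.add(link)
                rows.append({"link": link, "image": img})
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        t.sleep(pause)
        new_height = browser.execute_script("return document.body.scrollHeight")
        if new_height == last_height:  # nothing new loaded -> done
            break
        last_height = new_height
    return rows
```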