How to scrape links and images using Selenium
I want to get the links and images of these characters.
Website link:
https://opensea.io/collection/meebits?search[sortAscending]=false&search[sortBy]=FAVORITE_COUNT
XPATH SELECTION CODE
coll_name = driver.find_elements(By.XPATH, '//h1')
coll_desc = driver.find_elements(By.XPATH, '//div[@class="sc-1xf18x6-0 sc-1aqfqq9-0 sc-1y1ib3i-7 haVRLx dfsEJr eGsklH"]')
profiles = driver.find_element(By.XPATH, '//div[@role="grid"]/div')
for profile in profiles:
    art_name = driver.find_elements(By.XPATH, '/div[@class="sc-7qr9y8-0 sc-dw611d-1 iUvoJs fcpvjL"]')
    art_price = driver.find_elements(By.XPATH, '//div[@class="sc-7qr9y8-0 iUvoJs Price--amount"]')
    art_link = driver.find_elements(By.LINK_TEXT, '(//link/@href)[16]')
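(For context on what the code above is attempting: lookups inside a per-card loop generally need to be scoped to each card element with a relative `.//` XPath, `find_elements` (plural) must be used to get an iterable, and `href`/`src` should be read with `get_attribute` because Selenium cannot return XPath attribute selections like `(//link/@href)[16]` as elements. A sketch under those assumptions; the grid locator relies only on the page's `role` attributes, since the generated class names change frequently:)

```python
# Sketch of per-card scraping, assuming Selenium 4+. In Selenium 4 the locator
# strategy may be passed as the plain string "xpath" (which is what By.XPATH
# equals), so this function is importable without a browser session.
def scrape_cards(driver):
    """Collect text, link URLs and image URLs for every card in the grid."""
    rows = []
    # find_elements (plural) returns a list; find_element returns a single
    # element, which is not iterable.
    for card in driver.find_elements('xpath', '//div[@role="grid"]//div[@role="gridcell"]'):
        # A leading './/' keeps each search inside this card; '/div[...]' or
        # '//div[...]' search from the document root, so every iteration
        # would see the same nodes.
        links = [a.get_attribute('href') for a in card.find_elements('xpath', './/a')]
        images = [img.get_attribute('src') for img in card.find_elements('xpath', './/img')]
        rows.append({'text': card.text, 'links': links, 'images': images})
    return rows
```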
FOR LOOP CODE
for c in coll_name:
    collection_name.append(c.text)
    time.sleep(1)
for d in coll_desc:
    collection_desc.append(d.text)
    time.sleep(1)
for n in art_name:
    artname.append(n.text)
    time.sleep(1)
for p in art_price:
    price.append(p.text)
    time.sleep(1)
for l in art_link:
    link.append(n.text)
    time.sleep(1)
Please help me solve this issue.
The following code will scroll the page, get the items on it, scroll again, and so on. The Selenium setup is for Linux; you just need to pay attention to the imports, and to the code after defining the driver/browser:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import time as t
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
df_list = []
url = 'https://opensea.io/collection/meebits?search%5BsortAscending%5D=false&search%5BsortBy%5D=FAVORITE_COUNT'
browser.get(url)
while True:
    try:
        items = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@role='grid']/div[@role='gridcell']")))
        for item in items:
            print(item.text)
            print('______________')
    except Exception as e:
        continue
    browser.execute_script("window.scrollTo(0,document.body.scrollHeight);")
    t.sleep(5)
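Note that as written the `while True` loop never exits: once the page stops growing, it keeps scrolling forever. A common fix (my sketch, not part of the answer above) is to compare `document.body.scrollHeight` between passes and stop once it is stable:

```python
import time

def scroll_to_bottom(browser, pause=5.0, on_pass=None):
    """Scroll until document.body.scrollHeight stops growing.

    `on_pass` is an optional callback for collecting items between scrolls
    (e.g. the WebDriverWait/print block above). Returns the number of
    scrolls that loaded new content.
    """
    last_height = browser.execute_script("return document.body.scrollHeight")
    productive_scrolls = 0
    while True:
        if on_pass is not None:
            on_pass()  # scrape whatever is currently in the DOM
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy-loaded cards time to render
        new_height = browser.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new loaded: we reached the bottom
        last_height = new_height
        productive_scrolls += 1
    return productive_scrolls
```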
A couple of notes: