How to scrape data using Selenium and Python? I am trying to extract all the data that is in the title div tag.
from selenium import webdriver
import pandas as pd
import time
import requests
from selenium.common.exceptions import ElementClickInterceptedException
driver = webdriver.Chrome(executable_path ="D:\\chromedriver_win32\chromedriver.exe")
url = "https://www.fynd.com/brands/"
driver.get(url)
time.sleep(2)
driver.maximize_window()
luxury_brand_names = []
element = driver.find_element_by_css_selector("//div[@class='group-cards']")#.get_attribute("title")
#element = driver.find_elements_by_xpath("//div[@classdata-v-2f624c7c data-v-73869697 title]")
for a in element:
    luxury_brand_names.append()
print(luxury_brand_names)
This is the code I am running and I am not getting any output. Please help me with this; I am very new to coding and scraping data. I am trying to get all the data that is in the title div tag.
I think the only things you need are to change your selector, identify the elements with find_elements, and loop through them. Also, you need to actually pass a value in to append(). It should be:
elements = driver.find_elements_by_css_selector("div.card-item")
for element in elements:
    luxury_brand_names.append(element.get_attribute('title'))
First of all, your append() is empty, so nothing is added to the list.
Second, you need to change the lookup to element = driver.find_elements_by_css_selector("div.card-item") (note find_elements, plural, and a CSS selector rather than an XPath expression) so it returns a list of items. Then you can use it in your loop like:
luxury_brand_names.append(a.get_attribute("title"))
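Incidentally, the empty append() is not just a silent no-op: list.append requires exactly one argument, so the loop body raises a TypeError before anything is collected. A minimal demonstration:

```python
luxury_brand_names = []

try:
    # list.append() takes exactly one argument; calling it with none fails.
    luxury_brand_names.append()
except TypeError as err:
    error_message = str(err)

# The list is still empty, and the error message names the missing argument.
print(luxury_brand_names, "-", error_message)
```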
Here is an answer using Beautiful Soup and Selenium together:
from bs4 import BeautifulSoup
from selenium import webdriver
url = "https://www.fynd.com/brands/"
driver = webdriver.Chrome(executable_path ="D:\\chromedriver_win32\chromedriver.exe")
driver.get(url)
soup = BeautifulSoup(driver.page_source,"html.parser")
title = soup.find_all('span',{'class':'ukt-title clrWhite'})
all_titles = list()
for jelly in range(len(title)):
    all_titles.append(title[jelly].text.strip())
print(all_titles)
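The find_all pattern itself can be sanity-checked against a static HTML snippet, without launching a browser. The markup below is made up for illustration and is not Fynd's real page structure:

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the span/class structure targeted above.
html = """
<div class="group-cards">
  <span class="ukt-title clrWhite"> Gucci </span>
  <span class="ukt-title clrWhite">Prada</span>
  <span class="other">not a title</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
title = soup.find_all("span", {"class": "ukt-title clrWhite"})
all_titles = [t.text.strip() for t in title]
print(all_titles)  # the non-matching span is excluded
```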
Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license; if you need to repost, please credit this site or the original source. For any questions contact: yoyou2525@163.com.