
Web scraping using Selenium and BeautifulSoup

I am trying to scrape product information from Grofers and BigBasket, but I'm having trouble with the findAll() function. When I use len(imgList), the length always returns 0, so the result is always an empty list. How can I solve this? I also get status code 403 from Grofers.

from bs4 import BeautifulSoup
url = 'https://grofers.com/cn/grocery-staples/cid/16'
driver = webdriver.Chrome(r'C:\Users\HP\data\chromedriver.exe')
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html,'html.parser')
data = soup.findAll('plp-product__name')
print(data)
from bs4 import BeautifulSoup
response = requests.get('https://grofers.com/cn/grocery-staples/cid/16')
response
content = response.content
data = BeautifulSoup(content,'html5lib')
read = data.findAll('plp-product__name ')
read

In the output I get: []

You haven't included the Selenium import:

from selenium import webdriver 
driver = webdriver.Chrome(executable_path=r'C:\Users\HP\data\chromedriver.exe')

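findAll('plp-product__name') searches for a tag named plp-product__name rather than an element with that class, so it returns an empty list. Try selecting by class instead: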

data = soup.select('div.plp-product__name')

Or alternatively

data = soup.find_all("div",class_="plp-product__name")
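
To see the difference between the three lookups, here is a minimal sketch run against a small inline HTML string. The markup and product names are made up for illustration; they only assume that the product names sit in div elements with the class plp-product__name, as in the snippets above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the rendered product page.
SAMPLE_HTML = """
<div class="plp-product">
  <div class="plp-product__name">Basmati Rice</div>
  <div class="plp-product__name">Toor Dal</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A bare string is treated as a tag name, so this looks for
# <plp-product__name> elements and finds none.
by_tag_name = soup.find_all("plp-product__name")

# class_ filters on the class attribute instead.
by_class = soup.find_all("div", class_="plp-product__name")

# The equivalent CSS selector.
by_selector = soup.select("div.plp-product__name")

names = [tag.get_text(strip=True) for tag in by_class]
print(len(by_tag_name), names)
```

The first lookup mirrors the call in the question and comes back empty, while the two class-based lookups return the same two elements.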

Note that find_all is the current method name; findAll is the older BeautifulSoup 3 spelling, which bs4 keeps only as a deprecated alias.
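
As for the 403 status code: some sites refuse requests that do not look like they come from a browser, and one thing worth trying is sending a browser-like User-Agent header. The sketch below only builds such a request with the standard library's urllib (swapped in here so the example has no extra dependencies); the header string is an illustrative assumption, and there is no guarantee that this particular site will accept it.

```python
import urllib.request

URL = "https://grofers.com/cn/grocery-staples/cid/16"

# Assumed browser-like header; the exact string is illustrative.
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}


def build_request(url):
    """Return a Request carrying the browser-like headers (nothing is sent yet)."""
    return urllib.request.Request(url, headers=HEADERS)


request = build_request(URL)
print(request.get_header("User-agent"))

# To actually fetch the page (needs network access and may still be refused):
# with urllib.request.urlopen(request, timeout=10) as response:
#     html = response.read()
```

The same dictionary can be passed to requests as requests.get(url, headers=HEADERS). If the product list is rendered by JavaScript, the Selenium route is still the one that sees the final HTML.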


 