Web scraping using Selenium and BeautifulSoup

Question

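I am trying to scrape information from Grofers and BigBasket, but I am having trouble with the findAll() function. When I call len(imgList), the length is always 0; the result is always an empty list. How can I fix this? Can someone help me? With Grofers I also get status code 403.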

from bs4 import BeautifulSoup
url = 'https://grofers.com/cn/grocery-staples/cid/16'
driver = webdriver.Chrome(r'C:\Users\HP\data\chromedriver.exe')
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html,'html.parser')
data = soup.findAll('plp-product__name')
print(data)

from bs4 import BeautifulSoup
response = requests.get('https://grofers.com/cn/grocery-staples/cid/16')
response
content = response.content
data = BeautifulSoup(content,'html5lib')
read = data.findAll('plp-product__name ')
read

In the output I get: []

Answer 1

You did not include:

from selenium import webdriver 
driver = webdriver.Chrome(executable_path=r'C:\Users\HP\data\chromedriver.exe')

Try:

data = soup.select('div.plp-product__name')

Or alternatively:

data = soup.find_all("div",class_="plp-product__name")

Note that the proper method is find_all rather than findAll, because findAll is the older name and is deprecated in the bs4 library.
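
To illustrate why the original call returned an empty list, here is a minimal sketch that runs against a small inline HTML snippet instead of the live site. The markup and product names in SAMPLE_HTML are made up for illustration; only the class name plp-product__name comes from the question. A bare string passed to find_all is treated as a tag name, so it finds nothing unless the page contains a `<plp-product__name>` element.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the rendered product page;
# only the class name "plp-product__name" is taken from the question.
SAMPLE_HTML = """
<div class="plp-product">
  <div class="plp-product__name">Basmati Rice</div>
  <div class="plp-product__name">Toor Dal</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A bare string is interpreted as a tag name, so this searches for
# <plp-product__name> elements and finds none.
by_tag = soup.find_all("plp-product__name")

# Matching on the CSS class instead, in the two ways the answer suggests.
by_class = soup.find_all("div", class_="plp-product__name")
by_selector = soup.select("div.plp-product__name")

# Extract the visible text of each matched element.
names = [tag.get_text(strip=True) for tag in by_class]
print(len(by_tag), names)
```

If the class-based lookup still comes back empty on the real page, the elements are probably not in the HTML you received (for example, a 403 response or content rendered later by JavaScript), which is a separate problem from the selector.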

Solution 1
0, accepted, 2020-07-17 10:50:31
