Scraping nested element on e-commerce website
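I am trying to scrape the product image URL from Target's website with Selenium when I visit a specific product page, but nothing is returned.
Here is that part of my code: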
    # ADD THE IMAGE URL
    j = 0
    found = False
    while j < 5 and not found:
        try:
            img_panel = driver.find_element_by_class_name('slideDeckPicture')
            img_panel = img_panel.find_element_by_tag_name('img')
            # was img.get_attribute('alt'): img is undefined, and the bare except hid the NameError
            img_name = img_panel.get_attribute('alt')
            img_url = img_panel.get_attribute('src')
            # img_urls.append(img_url)
            line += ',"' + img_url + '"'
            found = True
            break
        # if it can't find the image, it probably hasn't loaded. wait and try again.
        except Exception:
            j += 1
            time.sleep(4)
            # img_urls.append('NO URL')
            # pass
    # if we've tried 5 times add no url
    if not found:
        line += ',NO IMG URL'
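The wait-and-try-again logic above could be pulled out into a small helper so that the lookup and the retry policy are kept apart. This is only a sketch: the names retry, fetch, attempts, delay and sleep are hypothetical and not part of Selenium or of the original code.

```python
import time


def retry(fetch, attempts=5, delay=4.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    Returns (True, value) on success, or (False, None) if every attempt raised.
    All names here are illustrative, not taken from any library.
    """
    for attempt in range(attempts):
        try:
            return True, fetch()
        except Exception:
            # The element has probably not loaded yet: wait before the
            # next attempt, but do not sleep after the final failure.
            if attempt < attempts - 1:
                sleep(delay)
    return False, None
```

With the question's code, fetch would be a function that performs the two find_element calls and returns the src attribute. The resulting pair then decides whether the URL or 'NO IMG URL' is appended to line.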
The urls list contains the URLs you are looking for:
    import requests as rq
    from bs4 import BeautifulSoup as bs

    url = "https://www.target.com/p/revolution-beauty-conceal-define-concealer-0-11-fl-oz/-/A-82003638?preselect=81551727#lnk=sametab"
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0"}
    resp = rq.get(url, headers=headers)
    soup = bs(resp.content, "html.parser")  # name the parser explicitly
    # first container that holds the product images
    divs_img = soup.find_all("div", attrs={"data-test": "product-image"})[0]
    # keep only https URLs; .get() skips <img> tags that have no src
    urls = [i["src"] for i in divs_img.find_all("img") if i.get("src", "").startswith("https")]
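To check the selection logic without hitting the live site, the same filter can be run on a small HTML string. The sketch below swaps BeautifulSoup for the standard library's html.parser so it has no dependencies. The sample markup is invented for illustration and is not Target's real page, and ProductImageParser and product_image_urls are hypothetical names.

```python
from html.parser import HTMLParser


class ProductImageParser(HTMLParser):
    """Collect https <img> src values inside the first product-image div."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # div nesting depth inside the target div; 0 = outside
        self.done = False   # True once the first target div has been closed
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth:
                self.depth += 1
            elif not self.done and attrs.get("data-test") == "product-image":
                self.depth = 1
        elif tag == "img" and self.depth:
            src = attrs.get("src") or ""
            if src.startswith("https"):
                self.urls.append(src)

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True


def product_image_urls(html):
    parser = ProductImageParser()
    parser.feed(html)
    return parser.urls


# Invented sample markup, for illustration only.
SAMPLE = """
<div data-test="product-image">
  <div><img src="https://example.com/a.jpg" alt="front"></div>
  <img src="data:image/gif;base64,AAAA">
  <img alt="no src">
</div>
<div data-test="product-image"><img src="https://example.com/b.jpg"></div>
<img src="https://example.com/outside.jpg">
"""

print(product_image_urls(SAMPLE))
```

As in the answer's code, only the first matching container is used, and images without an https src are skipped.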