Hi, I'm new to web scraping and I'm trying to follow a tutorial, but I have trouble accessing certain items. This is the page I want to scrape: https://www.newegg.com/todays-deals?cm_sp=Homepage_4spots-_--_-12182020
I want to get the title, brand and price of each product. Everything works fine outside of the loop, but I get errors once I write the loop over all the products.
# this is the loop to scrape all items from the webpage
containers = pagesoup.findAll("div", {"class": "item-container"})
for con in containers:
    title = con.img["title"]
    titleco = con.findAll("div", {"class": "item-branding"})
    brand = titleco[0].img["title"]
    priceco = con.findAll("li", {"class": "price-current"})
    priceco[0].text.strip()
I get this error:
----> 5 brand= titleco[0].img["title"]
TypeError: 'NoneType' object is not subscriptable
Not every item-branding item on your page has an img, so in some cases titleco[0].img is None. That is why you get an error when you try to read its "title" attribute.
You run into a similar issue later with price-current: sometimes there are zero matches, so you get an error when you try to access the first element of the ResultSet via priceco[0]. Or at least I do, but your site seems to be partially unavailable in my country, so you may not get the same results.
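Both failure modes can be reproduced on a small hand-written snippet. The HTML below is made up for illustration (it only borrows the class names from your code, not the real page), and the sketch assumes BeautifulSoup 4 with the built-in html.parser:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the branding div has no <img>, and there is no price-current element.
html = """
<div class="item-container">
  <img title="Some product">
  <div class="item-branding"><a href="#">no logo here</a></div>
</div>
"""
con = BeautifulSoup(html, "html.parser").find("div", {"class": "item-container"})
errors = []  # names of the exceptions we hit, in order

titleco = con.findAll("div", {"class": "item-branding"})
print(titleco[0].img)  # the branding div has no <img> descendant

try:
    titleco[0].img["title"]  # subscripting None
except TypeError as err:
    errors.append(type(err).__name__)
    print("TypeError:", err)

priceco = con.findAll("li", {"class": "price-current"})
print(len(priceco))  # no matches, so the ResultSet is empty

try:
    priceco[0]  # indexing an empty ResultSet
except IndexError as err:
    errors.append(type(err).__name__)
    print("IndexError:", err)
```

The first error is the one from your traceback; the second only shows up once the first is fixed, which is why it is easy to miss.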
Here's a version of your code that runs:
containers = pagesoup.findAll("div", {"class": "item-container"})
for con in containers:
    title = con.img["title"]
    titleco = con.findAll("div", {"class": "item-branding"})
    # Some item-branding divs have no <img>, so check before subscripting.
    if titleco[0].img is not None:
        brand = titleco[0].img["title"]
    priceco = con.findAll("li", {"class": "price-current"})
    # Some containers have no price-current element at all.
    if len(priceco) > 0:
        priceco[0].text.strip()
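If you want to keep the values instead of just avoiding the crash, one possible variation is to collect each product into a dict and store None for whatever is missing. This is only a sketch: parse_item is a name I made up, the sample HTML is invented rather than taken from the real page, and it uses find(), which returns None when nothing matches, in place of findAll(...)[0]:

```python
from bs4 import BeautifulSoup

def parse_item(con):
    """Return title, brand and price for one item-container; missing fields become None."""
    img = con.find("img")
    branding = con.find("div", {"class": "item-branding"})
    brand_img = branding.find("img") if branding is not None else None
    price = con.find("li", {"class": "price-current"})
    return {
        "title": img.get("title") if img is not None else None,
        "brand": brand_img.get("title") if brand_img is not None else None,
        "price": price.get_text(strip=True) if price is not None else None,
    }

# Made-up sample: the second product has no brand logo and no price.
sample = """
<div class="item-container">
  <img title="Widget A">
  <div class="item-branding"><img title="Acme"></div>
  <ul><li class="price-current">$19.99</li></ul>
</div>
<div class="item-container">
  <img title="Widget B">
  <div class="item-branding"></div>
</div>
"""
pagesoup = BeautifulSoup(sample, "html.parser")
items = [parse_item(con) for con in pagesoup.findAll("div", {"class": "item-container"})]
for item in items:
    print(item)
```

With the real page you would build pagesoup from the downloaded HTML as in your own code; every product then ends up in items, and you can decide later whether to skip the ones with a None brand or price.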