I am adding this portion to my code and I have no clue what is going on. This is the link I am using: https://www.digikey.com/products/en?keywords=ID82C55
All in the same process:
- My CSS selector returns None.
- Then it finds a couple of the HTML elements and returns some of them.
- Then it finds the last element.
This causes my program to mix up the data and yield it incorrectly to my CSV file. Could anyone tell me what the problem is here? Thanks.
Code
def parse(self, response):
    for b in response.css('div#pdp_content.product-details > div'):
        if b.css('div.product-details-headline h1::text').get():
            part = b.css('div.product-details-headline h1::text').get()
            part = part.strip()
            parts1 = part
            print(b.css('div.product-details-headline h1::text').get())
            print(parts1)
        else:
            print(b.css('div.product-details-headline h1::text').get())
        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get():
            cleaned_quantity = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get()
            print(cleaned_quantity)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get())
        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get():
            cleaned_price = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get()
            print(cleaned_price)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get())
        if b.css('div.quantity-message span#dkQty::text').get():
            cleaned_stock = b.css('div.quantity-message span#dkQty::text').get()
            print(cleaned_stock)
        else:
            print(b.css('div.quantity-message span#dkQty::text').get())
        if b.css('table#product-attribute-table > tr:nth-child(7) td::text').get():
            status = b.css('table#product-attribute-table > tr:nth-child(7) td::text').get()
            status = status.strip()
            cleaned_status = status
            print(cleaned_status)
        else:
            print(b.css('table#product-attribute-table > tr:nth-child(7) td::text').get())
        # yield {
        #     'Part': parts1,
        #     'Quantity': cleaned_quantity,
        #     'Price': cleaned_price,
        #     'Stock': cleaned_stock,
        #     'Status': cleaned_status,
        # }
Output
None
None
None
None
None
None
2,500
29.10828
29
None
ID82C55A
ID82C55A
None
None
None
Active
I highly recommend switching to XPath expressions, which let you select each value by its label instead of by its position:
part_number = b.xpath('.//th[.="Manufacturer Part Number"]/following-sibling::td[1]/text()').get()
stock = b.xpath('.//span[.="In Stock"]/preceding-sibling::span[1]/text()').get()
and so on for the other fields.
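The output in the question also suggests why the values get mixed: the loop visits each child div of the container in turn, and each selector can only match inside the one div that holds its element, so every pass yields mostly None. The sketch below illustrates this with the standard library's ElementTree standing in for Scrapy's selectors. The markup, the "Part Status" label, and the helper names `text_of` and `extract` are assumptions made up for the illustration, not the real page or any Scrapy API.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the product page markup.
PAGE = """
<div id="pdp_content">
  <div class="headline"><h1> ID82C55A </h1></div>
  <div class="pricing"><span id="dkQty">29</span></div>
  <div class="attributes">
    <table>
      <tr><th>Manufacturer Part Number</th><td>ID82C55A</td></tr>
      <tr><th>Part Status</th><td> Active </td></tr>
    </table>
  </div>
</div>
"""


def text_of(node, path):
    """Return the stripped text of the first match, or None if nothing matches."""
    found = node.find(path)
    return found.text.strip() if found is not None and found.text else None


def extract(node):
    """Look up every field relative to the given node."""
    return {
        "Part": text_of(node, ".//h1"),
        "Stock": text_of(node, ".//span[@id='dkQty']"),
        # Select the cell by its row label rather than by row position.
        "Status": text_of(node, ".//tr[th='Part Status']/td"),
    }


root = ET.fromstring(PAGE)

# Looping over the child divs: each pass sees only one of the fields.
per_div = [extract(div) for div in root.findall("./div")]

# Querying once from the container: a single complete record.
record = extract(root)

for row in per_div:
    print(row)
print(record)
```

In the spider, the equivalent change would presumably be to drop the for loop and call `response.css(...)` or `response.xpath(...)` once per field, then yield a single item per page.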