{'productname': None, 'productphoto': None, 'productprice': None}. Css selector is not returning anything in scrapy project
scrapy css selector returning None then finds value
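So basically I added this part to my code and I have no idea what is happening. This is the link I am using: https://www.digikey.com/products/en?keywords=ID82C55. All of the following happens in the same run:
- First my CSS selectors return nothing.
- Then they find a few HTML elements and return some of the values.
- Then they find the last element.
As a result, my program mixes and matches the data and writes it incorrectly to my CSV file. Can anyone tell me what is wrong here? Thanks.
Code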
def parse(self, response):
    for b in response.css('div#pdp_content.product-details > div'):
        if b.css('div.product-details-headline h1::text').get():
            part = b.css('div.product-details-headline h1::text').get()
            part = part.strip()
            parts1 = part
            print(b.css('div.product-details-headline h1::text').get())
            print(parts1)
        else:
            print(b.css('div.product-details-headline h1::text').get())
        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get():
            cleaned_quantity = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get()
            print(cleaned_quantity)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get())
        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get():
            cleaned_price = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get()
            print(cleaned_price)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get())
        if b.css('div.quantity-message span#dkQty::text').get():
            cleaned_stock = b.css('div.quantity-message span#dkQty::text').get()
            print(cleaned_stock)
        else:
            print(b.css('div.quantity-message span#dkQty::text').get())
        if b.css('table#product-attribute-table > tr:nth-child(7) td::text').get():
            status = b.css('table#product-attribute-table > tr:nth-child(7) td::text').get()
            status = status.strip()
            cleaned_status = status
            print(cleaned_status)
        else:
            print(b.css('table#product-attribute-table > tr:nth-child(7) td::text').get())
        # yield {
        #     'Part': parts1,
        #     'Quantity': cleaned_quantity,
        #     'Price': cleaned_price,
        #     'Stock': cleaned_stock,
        #     'Status': cleaned_status,
        # }
Output
None
None
None
None
None
None
2,500
29.10828
29
None
ID82C55A
ID82C55A
None
None
None
Active
I strongly recommend that you switch to XPath expressions:
part_number = b.xpath('.//th[.="Manufacturer Part Number"]/following-sibling::td[1]/text()').get()
stock = b.xpath('.//span[.="In Stock"]/preceding-sibling::span[1]/text()').get()
etc.
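The output in the question appears to come from three passes through the loop (5 + 5 + 6 printed lines): each direct child `div` of `#pdp_content` holds only some of the fields, so every selector returns `None` in the passes where its element lives in a different child. A minimal sketch of one way around this, assuming one product per page, is to drop the loop and query the whole response once. The helper names `first_text` and `parse_product` are hypothetical, and the two XPath expressions are the ones given above and have not been checked against the live page.

```python
def first_text(sel, xpath):
    """Return the stripped text of the first match, or None if nothing matches."""
    value = sel.xpath(xpath).get()
    return value.strip() if value else None


def parse_product(response):
    """Build a single item by querying the whole page once instead of per child div.

    `response` is expected to behave like a Scrapy response or Selector,
    i.e. it offers .xpath(expr).get().
    """
    return {
        # Locate each value by its label rather than by its position in the table.
        "Part": first_text(
            response,
            './/th[.="Manufacturer Part Number"]/following-sibling::td[1]/text()',
        ),
        "Stock": first_text(
            response,
            './/span[.="In Stock"]/preceding-sibling::span[1]/text()',
        ),
    }
```

Inside the spider, `parse` would then reduce to `yield parse_product(response)`. Because every field is looked up in the same call, a value found in one pass of a loop can no longer be combined with a stale value left over from an earlier pass, which is the likely source of the mixed-up CSV rows.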