Scrapy ValueError: url can't be None

Question

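Introduction

I have to create a spider that crawls information from https://www.karton.eu/einwellig-ab-100-mm, as well as each product's weight, which can be scraped after following the product link to its own page.

After running my code, I get the following error message:

I have already checked whether the URL is broken: I can fetch it in my scrapy shell.

Code used: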

import scrapy
from ..items import KartonageItem

class KartonSpider(scrapy.Spider):
    name = "kartons"
    allow_domains = ['karton.eu']
    start_urls = [
        'https://www.karton.eu/einwellig-ab-100-mm'
        ]
    custom_settings = {'FEED_EXPORT_FIELDS': ['SKU', 'Title', 'Link', 'Price', 'Delivery_Status', 'Weight'] } 
    def parse(self, response):
        card = response.xpath('//div[@class="text-center artikelbox"]')

        for a in card:
            items = KartonageItem()
            link = a.xpath('@href')
            items ['SKU'] = a.xpath('.//div[@class="signal_image status-2"]/small/text()').get()
            items ['Title'] = a.xpath('.//div[@class="title"]/a/text()').get()
            items ['Link'] = link.get()
            items ['Price'] = a.xpath('.//div[@class="price_wrapper"]/strong/span/text()').get()
            items ['Delivery_Status'] = a.xpath('.//div[@class="signal_image status-2"]/small/text()').get()
            yield response.follow(url=link.get(),callback=self.parse, meta={'items':items})

    def parse_item(self,response):
        table = response.xpath('//span[@class="staffelpreise-small"]')

        items = KartonageItem()
        items = response.meta['items']
        items['Weight'] = response.xpath('//span[@class="staffelpreise-small"]/text()').get()
        yield items

What is causing this error?

Answer 1

The problem is that your link.get() returns None. The issue appears to be in your XPath.

def parse(self, response):
    card = response.xpath('//div[@class="text-center artikelbox"]')

    for a in card:
        items = KartonageItem()
        link = a.xpath('@href')

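The card variable does select several div tags, but there is no @href on the self axis of those divs, which is why the selector comes back empty. The href lives on the a tag among their descendants. So I believe this should give you the expected result: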

def parse(self, response):
    card = response.xpath('//div[@class="text-center artikelbox"]')

    for a in card:
        items = KartonageItem()
        link = a.xpath('a/@href') # FIX HERE <<<<<
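
To make the self-versus-child distinction concrete, here is a minimal stdlib-only sketch. It uses xml.etree.ElementTree rather than Scrapy's selectors, and the markup in SAMPLE is hypothetical (only loosely modeled on the product cards in the question), so treat it as an illustration of the idea rather than the site's real structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption, not copied from the real page).
SAMPLE = """
<root>
  <div class="text-center artikelbox">
    <a href="/product-1">Product 1</a>
  </div>
  <div class="text-center artikelbox">
    <span>No link in this card</span>
  </div>
</root>
"""


def extract_links(markup):
    """Return (href_on_div, href_on_child_a) for every product card."""
    root = ET.fromstring(markup)
    results = []
    for card in root.findall(".//div[@class='text-center artikelbox']"):
        # Roughly what a.xpath('@href').get() asks for: an attribute on the div itself.
        own_href = card.get("href")
        # Roughly what a.xpath('a/@href').get() asks for: the href of a child <a>.
        child = card.find("a")
        child_href = child.get("href") if child is not None else None
        results.append((own_href, child_href))
    return results


links = extract_links(SAMPLE)
# Keep only the cards that actually yielded a link before following anything.
followable = [child for _, child in links if child is not None]

print(links)
print(followable)
```

The div never carries an href of its own, so the first element of every pair is None, which is the value that makes response.follow raise the ValueError. In the spider, an equivalent guard would be to skip the card (for example with continue) whenever link.get() is None. Separately, the question's code passes callback=self.parse to response.follow, so parse_item would presumably never run unless the callback is changed to self.parse_item.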
