Scrapy / Python: replacing empty strings

Question

This is my Scrapy spider code. I am trying to extract metadata values from a website. No meta tag appears more than once on a page.

import scrapy


class MySpider(scrapy.Spider):
    name = "courses"
    start_urls = ['http://www.example.com/listing']
    allowed_domains = ["example.com"]

    def parse(self, response):
        # Read the score values from the page's <meta> tags.
        for courses in response.xpath("//meta"):
            yield {
                'ScoreA': courses.xpath('//meta[@name="atarbur"]/@content').extract_first(),
                'ScoreB': courses.xpath('//meta[@name="atywater"]/@content').extract_first(),
                'ScoreC': courses.xpath('//meta[@name="atarsater"]/@content').extract_first(),
                'ScoreD': courses.xpath('//meta[@name="clearlywaur"]/@content').extract_first(),
            }
        # Follow the listing links and parse each linked page the same way.
        for url in response.xpath('//ul[@class="scrapy"]/li/a/@href').extract():
            yield scrapy.Request(response.urljoin(url), callback=self.parse)

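What I am trying to achieve: if the value of any of the scores is an empty string (''), I want to replace it with 0 (zero). I am not sure how to add conditional logic inside the yield block.

Any help is greatly appreciated.

Thanks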

Answer 1

The extract_first() method takes an optional default value as its argument, but in your case you can use an or expression:

foo = response.xpath('//foo').extract_first('').strip() or 0

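Here extract_first('') returns '' when nothing matches, so calling strip() is always safe. If the result is a string with no text, it evaluates to False, and the last operand of the or expression (0) is used instead.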
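
To see the fallback in isolation, here is a minimal sketch in plain Python, without Scrapy. The strings in samples are made-up stand-ins for what extract_first('') might return, not values from the actual page.

```python
# Hypothetical stand-ins for values that extract_first('') might return.
samples = ['', '   ', '85.5', ' 0 ']

# An empty string is falsy, so `or` falls through to its right-hand operand.
results = [s.strip() or 0 for s in samples]
print(results)  # [0, 0, '85.5', '0']
```

Note that the non-empty values stay strings (including '0', which is truthy because it is a non-empty string), so the result mixes int and str unless you convert it.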

To convert the string to another type, try:

foo = int(response.xpath('//foo').extract_first('').strip() or 0)
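
Applied to the yield block from the question, the same idea could be wrapped in a small helper so the expression is not repeated for every field. This is only a sketch: score_or_zero is a hypothetical name, not part of Scrapy, and it assumes the scores are whole numbers.

```python
def score_or_zero(raw):
    """Return the score as an int, or 0 when the value is missing or blank."""
    # `raw or ''` also covers None, which extract_first() returns when
    # nothing matches and no default is given.
    return int((raw or '').strip() or 0)


# Inside parse(), each field could then be written as, for example:
#     'ScoreA': score_or_zero(
#         response.xpath('//meta[@name="atarbur"]/@content').extract_first()),

print(score_or_zero(''), score_or_zero(None), score_or_zero(' 72 '))  # 0 0 72
```

If the scores can be decimals such as '85.5', int() raises ValueError; swapping in float() would be one way to handle that.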

Answer 1 (4, accepted, 2017-05-23 07:43:15)
