
How to set a default value when Scrapy selector with extract() returns None?

I am trying to yield the value of a tag that isn't always present in the pages I scrape with Scrapy. I am using the extract() function rather than extract_first(), so I cannot seem to set a default value as suggested in this SO post.

This doesn't work:

def parse(self, response):
    yield {
        'comments': response.css('[itemprop=commentCount]::attr(content)').extract(default=None)
    }

How can I set None as the default when I want to use extract() rather than extract_first()?

Thanks very much in advance!

Try this syntax:

{'comments': response.css('[itemprop=commentCount]::attr(content)').extract() or None}

If response.css(CSS).extract() returns an empty list, the or operator falls through and None is assigned as the value of the comments key. Otherwise, the extracted list is assigned.
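To see why the or None idiom works, here is a minimal sketch that uses plain Python lists in place of a real Scrapy response. The helper name list_or_default and the sample values are made up for illustration; the only assumption is that extract() returns an empty list when nothing matches.

```python
def list_or_default(values, default=None):
    """Return `values` unless it is empty (falsy), in which case return `default`.

    Hypothetical helper: it wraps the same `x or None` expression used above.
    """
    return values or default


# Stand-ins for what extract() might return (illustrative values only).
extracted_when_present = ["12"]  # the tag exists on the page
extracted_when_missing = []      # nothing matched the selector

print(list_or_default(extracted_when_present))
print(list_or_default(extracted_when_missing))
```

The first call prints the list unchanged and the second prints None, because an empty list is falsy in Python.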

.extract() returns the output as a list, whereas .extract_first() returns a single string.

response.xpath('xpath_of_the_component').extract_first(default="default_value").split()

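This line takes the first match, or the default string if there is no match, and split() then turns that string back into a list. Note that split() breaks on whitespace, so a value containing spaces becomes several list items, and the default has to be a string rather than None.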
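A rough sketch of what that one-liner does, again with plain values instead of a live response. The function first_as_list is a hypothetical stand-in: its first_value argument plays the role of the first match (None when nothing matched), and "default_value" is just the placeholder used above.

```python
def first_as_list(first_value, default="default_value"):
    """Mimic extract_first(default=...).split() on a plain value.

    Hypothetical helper: `first_value` is None when the selector matched nothing.
    """
    text = first_value if first_value is not None else default
    # str.split() with no arguments splits on runs of whitespace.
    return text.split()


print(first_as_list("12"))        # single-word value
print(first_as_list(None))        # no match, so the default string is used
print(first_as_list("New York"))  # whitespace splits the value in two
```

If the value may contain spaces, wrapping it in a list by hand (for example [text]) avoids the unwanted split.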
