How to set a default value when Scrapy selector with extract() returns None?
I am trying to yield the value of a tag that isn't always present in the pages that I scrape with Scrapy. I am using the extract() function rather than extract_first(). Therefore I cannot seem to set a default value, as suggested in this SO post.
This doesn't work:
def parse(self, response):
    yield {
        'comments': response.css('[itemprop=commentCount]::attr(content)').extract(default=None)
    }
How can I set None as the default when I want to use extract() rather than extract_first()?
Thanks very much in advance!
Try this syntax:
{'comments': response.css('[itemprop=commentCount]::attr(content)').extract() or None}
If the result of response.css(CSS).extract() is an empty list, then None will be assigned as the value of the comments key, because an empty list is falsy and the or operator falls through to its right-hand side. Otherwise, the actual list of extracted values will be assigned.
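To illustrate why the `or None` idiom works, here is a minimal sketch that uses plain Python lists as stand-ins for what .extract() returns, so it runs without Scrapy. The helper name `extract_or_default` is hypothetical and not part of Scrapy.

```python
def extract_or_default(values, default=None):
    """Return the extracted list, or `default` when the list is empty.

    `values` stands in for the list that .extract() returns.
    This helper is hypothetical; Scrapy does not provide it.
    """
    # An empty list is falsy, so `or` falls through to the default.
    return values or default


# Plain lists standing in for the result of .extract()
found = ["12"]
missing = []

print(extract_or_default(found))    # ['12']
print(extract_or_default(missing))  # None
```

One caveat of this idiom: it treats every falsy result as missing, which is fine here because .extract() returns a list and only the empty list is falsy.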
.extract() yields the output as a list and .extract_first() yields a string.
response.xpath('xpath_of_the_component').extract_first(default="default_value").split()
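This line of code takes the first match as a string, falls back to the default value when nothing matches, and then uses split() to convert that string back into a list. Note that split() breaks the string on whitespace, so the result contains only the first match (possibly split into several words), not every match that extract() would return.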
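The behavior of that one-liner can be sketched without Scrapy as follows. This is only an approximation under stated assumptions: `first_or_default` is a hypothetical helper that mimics extract_first(default=...) on a plain list, and the sample values are made up for illustration.

```python
def first_or_default(values, default="default_value"):
    """Mimic extract_first(default=...) on a plain list (hypothetical helper)."""
    # Return the first item when there is one, otherwise the default.
    return values[0] if values else default


# A single match becomes a one-element list again after split()
print(first_or_default(["12"]).split())         # ['12']

# No match: the default string is wrapped in a list
print(first_or_default([]).split())             # ['default_value']

# Caveat: split() breaks on whitespace, so one value can become several
print(first_or_default(["two words"]).split())  # ['two', 'words']
```

If the extracted values may contain spaces, the `or` idiom from the other answer avoids this splitting side effect.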