Returning a default value with Scrapy's ItemLoader when the given XPath matches nothing

Question

I need to return a default value when the XPath specified in an ItemLoader returns no value. Here is part of my spider. I am using a very basic ItemLoader:

il = ItemLoader(item=HomesItem(), response=response)
il.add_xpath('Company_Name', u'//*[@id="anchor_realtorOutline"]/div[1]/table/tbody/tr/th[contains(text(), "会社名")]/following-sibling::td/p[1]/ruby/text()')

If this XPath does not return a value, I want to store N/A in its place, much like .extract_first(default="N/A") does. I also need to use the ItemLoader to concatenate several XPaths for the same field. Sorry if this is a silly question; I am not very good at Scrapy yet. Thanks.

Answer 1

You can add the XPath, check whether the field has been set, and then add a default value if the field is still empty. For example:

il = ItemLoader(item=HomesItem(), response=response)
il.add_xpath('Company_Name', u'...')
# If the XPath collected nothing for this field, fall back to a default.
if not il.get_output_value('Company_Name'):
    il.add_value('Company_Name', 'N/A')
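
Since the question also mentions combining several XPaths for one field, the same check can be wrapped in a small helper. This is only a sketch: `add_xpaths_with_default` is a hypothetical name, not part of Scrapy, and it relies only on the loader's `add_xpath`, `get_output_value` and `add_value` methods. `FakeLoader` is a stand-in so the snippet runs without Scrapy installed; in a spider you would pass your real `ItemLoader` instead, and what `get_output_value` returns there depends on the output processor you have configured.

```python
def add_xpaths_with_default(loader, field_name, xpaths, default="N/A"):
    """Add every XPath to the field, then store `default` if none matched.

    Hypothetical helper (not a Scrapy API); `loader` is expected to behave
    like an ItemLoader.
    """
    for xpath in xpaths:
        loader.add_xpath(field_name, xpath)
    # An empty result means no XPath collected anything for this field.
    if not loader.get_output_value(field_name):
        loader.add_value(field_name, default)


class FakeLoader:
    """Stand-in for ItemLoader, used only to exercise the helper."""

    def __init__(self, matches):
        self.matches = matches  # maps an XPath string to its extracted values
        self.values = {}

    def add_xpath(self, field_name, xpath):
        self.add_value(field_name, self.matches.get(xpath, []))

    def add_value(self, field_name, value):
        items = value if isinstance(value, list) else [value]
        if items:  # empty extractions are not stored
            self.values.setdefault(field_name, []).extend(items)

    def get_output_value(self, field_name):
        return self.values.get(field_name, [])


# No XPath matches, so the default is stored.
empty = FakeLoader({"//a": [], "//b": []})
add_xpaths_with_default(empty, "Company_Name", ["//a", "//b"])

# Both XPaths match, so their values are kept and no default is added.
found = FakeLoader({"//a": ["Acme"], "//b": ["Ltd"]})
add_xpaths_with_default(found, "Company_Name", ["//a", "//b"])
```

With a real loader the call would look like `add_xpaths_with_default(il, 'Company_Name', [xpath_1, xpath_2])`, where `xpath_1` and `xpath_2` are placeholders for your own expressions.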

Answer 1
0 2019-04-24 09:10:31