
Passing extra arguments to scrapy.Request()

I want to store all the data (text, hrefs, images) related to a specific website in a single folder. To do that, I need to pass the path of that folder to all of the different parsing functions. So I want to pass this path as extra kwargs in scrapy.Request(), like this:

yield scrapy.Request(url=url, dont_filter=True, callback=self.parse, errback=self.errback_function, kwargs={'path': '/path/to_folder'})

But it gives the error TypeError: __init__() got an unexpected keyword argument 'kwargs'

How can I pass that path to the next function?

For anyone who may need it:

You can pass extra arguments through the meta argument, like this:

yield scrapy.Request(url=url, dont_filter=True, callback=self.parse, errback=self.errback_function, meta={'filepath': filepath})

UPDATE:

Request.cb_kwargs was introduced in Scrapy 1.7. Before that, Request.meta was the recommended way to pass information to callbacks. Since 1.7, Request.cb_kwargs is the preferred way to pass user data, leaving Request.meta for communication with components such as middlewares and extensions.

So for Scrapy >= 1.7, the following works:

 request = scrapy.Request('http://www.example.com/index.html', callback=self.parse_page2, cb_kwargs=dict(main_url=response.url))

You can refer to the documentation: https://doc.scrapy.org/en/latest/topics/request-response.html#passing-additional-data-to-callback-functions

This is an old topic, but for anyone who needs it: to pass an extra parameter, use cb_kwargs, then accept that parameter as a keyword argument in the parse method.

You can refer to this part of the documentation.
