
Scrapy: Pass variable from Middleware to Spider itself

I am trying to capture the raw request payload and request headers so that I can store them in my database for tracking. I know about response.request.headers, but those are the request headers that come back attached to the response.

Would it be possible to create a middleware that captures request.headers and the payload (body) and passes them to the spider, for example through the request's meta or something similar?

I found a way to do it (granted, without the middleware):

  • Store the scrapy.Request() in a variable, e.g. req
  • Store req.headers.to_unicode_dict() in self.req_headers
  • Store req.body in self.req_body
  • Execute yield req to send the request

