
How to get a response from scrapy.Request without a callback?

I want to send a request and wait for the server's response so that I can perform actions that depend on it. I wrote the following:

resp = yield scrapy.Request(**kwargs)

and got None in resp. In the documentation I found that I need to use a callback function, but that function is only called after the subsequent commands have been processed. How can I wait for the response from the server?

I found the inline_requests module, which provides an inline_requests decorator.

It solved my problem.

This isn't really how Scrapy is meant to be used, since waiting for a response is effectively the same as using a callback. If you need to keep processing a previous response together with the new one, you can pass it along (and keep passing it) in the request's meta argument.

To make this more readable in some cases, you can also use scrapy-inline-requests, which does exactly the same thing under the hood: it doesn't block Scrapy but issues the requests one after another, in order (the same as chaining requests with callbacks).
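To see why a bare resp = yield scrapy.Request(...) gives None, and what a decorator of this kind has to do, here is a toy model in plain Python. It is not the library's actual code and does not use Scrapy at all: FakeRequest, FakeResponse, fake_download and both drive_* helpers are hypothetical stand-ins, and the "download" is synchronous purely to keep the sketch short. The point is only that a yield expression evaluates to None when the generator is resumed by ordinary iteration, and to a value when something resumes it with send().

```python
class FakeRequest:
    """Stand-in for a request object (hypothetical, illustration only)."""

    def __init__(self, url):
        self.url = url


class FakeResponse:
    """Stand-in for a response object (hypothetical, illustration only)."""

    def __init__(self, url):
        self.url = url


def fake_download(request):
    # Pretend to fetch the URL; a real framework would do this asynchronously.
    return FakeResponse(request.url)


def callback():
    # The pattern from the question: use the value of the yield expression.
    resp = yield FakeRequest("https://example.com/a")
    yield {"seen": None if resp is None else resp.url}


def drive_with_next(gen):
    # Plain iteration: nothing is ever sent back into the generator,
    # so every "x = yield ..." inside it evaluates to None.
    return list(gen)


def drive_with_send(gen):
    # Inline style: whenever the generator yields a request, "download" it
    # and resume the generator with the response via send().
    items = []
    try:
        value = next(gen)
        while True:
            if isinstance(value, FakeRequest):
                value = gen.send(fake_download(value))
            else:
                items.append(value)
                value = next(gen)
    except StopIteration:
        pass
    return items


plain = drive_with_next(callback())
inline = drive_with_send(callback())
print(plain[-1])   # {'seen': None}
print(inline)      # [{'seen': 'https://example.com/a'}]
```

With plain iteration the request object itself also ends up in the output and resp stays None, which matches the behaviour described in the question.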

If you use scrapy-inline-requests, be careful that the decorated methods are strictly generators, and be careful about yielding new requests or items while an inline request is being processed.

This isn't an answer to the question, but an alternative way to get a response object and parse it with XPath. Here I use the requests, bs4 and lxml libraries.

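import requests
from bs4 import BeautifulSoup
from lxml import etree

url = 'your_url'  # placeholder: replace with the page to fetch
# requests.get() blocks until the server responds
soup = BeautifulSoup(requests.get(url).text, 'html.parser')
# Re-parse the serialized soup with lxml so that XPath queries can be used
dom = etree.HTML(str(soup))
target_data = dom.xpath("//div......target path......")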
