Best Method to Push Data from Scrapy to .Net Application
Best Method/Idea to push scraped data from Scrapy crawlers to .Net Application
Setup:
I am thinking about adding a RESTful API to my .Net Core service and pushing item data there from Scrapy on every crawler "finished" event.
Basically, I want a kind of "push notification" from the Scrapy server to my .Net app whenever a new data item is scraped.
What is the best place to put that call to an external API in Scrapy?
You have multiple options here. Pushing data is indeed the easiest solution, though make sure to authorize the requests you make to your API. You can use the item_scraped signal to send a request for every scraped item. Keep in mind that if there are hundreds of scraped items, this can put a lot of stress on your API, which you should avoid. Alternatively, you can wait until the scraper has finished and then call your API with a single request.
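A minimal sketch of that second approach, written as a Scrapy extension that buffers items on item_scraped and sends them in one authorized request on spider_closed. The setting names PUSH_API_URL and PUSH_API_TOKEN, the bearer-token scheme, and the payload shape are assumptions made for illustration; your .Net API may expect something different. It also assumes items are dict-like.

```python
import json
import urllib.request


def post_json(url, token, payload):
    """POST payload as JSON with a bearer token and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    # Note: this call blocks; it runs once, when the spider closes.
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


class BatchPushExtension:
    """Collects scraped items and pushes them in a single request."""

    def __init__(self, endpoint, token, send=post_json):
        self.endpoint = endpoint
        self.token = token
        self.send = send  # injectable so the class can be tested offline
        self.items = []

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the class itself can be exercised without Scrapy.
        from scrapy import signals

        ext = cls(
            endpoint=crawler.settings.get("PUSH_API_URL"),    # hypothetical setting
            token=crawler.settings.get("PUSH_API_TOKEN"),     # hypothetical setting
        )
        crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def item_scraped(self, item, spider):
        # Buffer instead of sending one request per item.
        self.items.append(dict(item))

    def spider_closed(self, spider, reason):
        # Send everything at once; skip the call if nothing was scraped.
        if self.items:
            payload = {"spider": spider.name, "reason": reason, "items": self.items}
            self.send(self.endpoint, self.token, payload)
```

To try it, the class would be registered in the project's EXTENSIONS setting under whatever module path you place it in, for example `EXTENSIONS = {"myproject.extensions.BatchPushExtension": 500}` (the path here is hypothetical). If you prefer one request per item, the same send call could be moved into item_scraped, at the cost of the extra API load described above.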
Some alternative solutions: