Scrapy - How to save the JSON file to S3 and the local file system at the same time
My settings are already set up to save the JSON to an S3 bucket. But I would also like to save it on my local machine, if that's possible.
I tried the configuration below, but Scrapy only saves to the local machine.
FEED_URI = 's3://bucket/scraped/file.jl'
FEED_URI = 'file:///tmp/file.jl'
I don't fully understand the settings explained in the Scrapy docs here.
Scrapy's feed exports extension doesn't support sending items to two destinations at the same time. FEED_URI in your settings is just a Python variable, so the reason it only saves to your local machine is that the second assignment overwrites the first.
You can work around that by using FEED_URI to send items to S3 and writing an item pipeline that saves your items locally.
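A minimal sketch of such a pipeline, assuming a project named myproject and the local path /tmp/file.jl from the question (the class name LocalJsonLinesPipeline is made up for illustration):

```python
import json


class LocalJsonLinesPipeline:
    """Writes each scraped item as one JSON line to a local file,
    independently of the feed export (which can keep pointing at S3)."""

    def __init__(self, path='/tmp/file.jl'):
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the output file.
        self.file = open(self.path, 'w')

    def close_spider(self, spider):
        # Called once when the spider finishes: flush and close.
        self.file.close()

    def process_item(self, item, spider):
        # Serialize the item and append it as a JSON line.
        self.file.write(json.dumps(dict(item)) + '\n')
        # Return the item so later pipelines (and the S3 feed) still see it.
        return item
```

Then enable it in settings.py alongside your existing S3 FEED_URI:

```python
ITEM_PIPELINES = {
    'myproject.pipelines.LocalJsonLinesPipeline': 300,
}
```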