To export my data to a CSV file, I'm currently using this (mainly because I never understood pipelines that well):
custom_settings = {
    'FEED_FORMAT': 'csv',
    'FEED_URI': 'datosAmazon.csv',
}
These custom settings are inside my spider.
Right now, I'm scraping different categories of items, for example laptops and cell phones.
The problem is that when I check my data, the items are not organized: a laptop appears, then a cell phone, then two laptops, then a cell phone, and so on.
I'm currently going into the different categories this way:
def start_requests(self):
    keywords = ['laptop', 'cellphone']
    for keyword in keywords:
        yield Request(self.search_url.format(keyword))
Is there a way to make the data more organized (two separate files would be even better), or an easy pipeline solution?
There is no settings-only way to achieve what you want.
That said, exporting to multiple files from a custom pipeline is pretty straightforward:

- Instantiate multiple exporters (scrapy.exporters.CsvItemExporter) in the open_spider method (probably store them in a dict)
- Pick the right exporter in the process_item method and call its export_item method
- Finish the exporters and close their files in the close_spider method

Don't forget to activate your pipeline :)