I'm using Scrapy to get the contents of some webpages. Is there a way to configure Scrapy so that it exports each data line into a separate file?
You can yield items in your spider to return multiple items to be processed in your pipeline.
class SomeSpider(Spider):
    ...

    def parse(self, response):
        # some code to parse the webpage
        for some_line in webpage:
            item = YourItem()
            # parse items
            yield item
This will return multiple items for one scraped page. Then configure your pipeline to write each item to a separate file.
class SomePipeline(object):

    def open_spider(self, spider):
        self.count = 0

    def process_item(self, item, spider):
        # use a unique filename per item; a fixed name opened in 'w'
        # mode would be overwritten on every call
        self.count += 1
        with open('item_%d.txt' % self.count, 'w') as f:
            # format your item into a line here
            f.write(line)
        return item