Export scrapy objects into one file per item

I'm using Scrapy to get the contents of some webpages. Is there a way to configure Scrapy so that it exports each line of data (each item) into a separate file?

You can yield items from your spider's parse method, so that one page produces multiple items, each of which is then processed by your pipeline.

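import scrapy

class SomeSpider(scrapy.Spider):

    ...

    def parse(self, response):
        # some code to parse the webpage; splitting the body into
        # lines is only a placeholder for your own extraction logic
        for some_line in response.text.splitlines():
            item = YourItem()  # YourItem is your own scrapy.Item subclass
            # parse items

            yield item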

This yields multiple items for one scraped page. Then have your pipeline write each item to its own file, giving every file a distinct name.

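class SomePipeline:

    ...

    def open_spider(self, spider):
        # counter used to build a distinct file name per item
        self.count = 0

    def process_item(self, item, spider):
        self.count += 1
        # a fixed name such as 'file.txt' opened with 'w' would be
        # overwritten by every item; the name pattern here is only an example
        with open('item_%d.txt' % self.count, 'w') as f:
            # format your item into a line here
            line = str(dict(item))
            f.write(line)
        # return the item so that any later pipelines still receive it
        return item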
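
For a more complete picture, here is a minimal sketch of a pipeline that saves every item as its own JSON file. The class name OneFilePerItemPipeline, the out_dir parameter, and the item_00001.json naming scheme are all assumptions made for illustration; only the open_spider and process_item hooks come from Scrapy's pipeline interface. It assumes the item is a dict or a scrapy.Item, both of which can be converted with dict(item).

```python
import json
import os


class OneFilePerItemPipeline:
    """Write every item that reaches the pipeline to its own JSON file."""

    def __init__(self, out_dir="items"):
        # out_dir is a hypothetical output directory, not a Scrapy setting
        self.out_dir = out_dir
        self.count = 0

    def open_spider(self, spider):
        # called once when the spider starts: create the directory, reset the counter
        os.makedirs(self.out_dir, exist_ok=True)
        self.count = 0

    def process_item(self, item, spider):
        self.count += 1
        # the running counter gives each item a distinct file name
        path = os.path.join(self.out_dir, f"item_{self.count:05d}.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(dict(item), f, ensure_ascii=False)
        # return the item so that later pipelines still receive it
        return item
```

The pipeline only runs if it is enabled in the project's settings.py through the ITEM_PIPELINES setting, for example ITEM_PIPELINES = {"myproject.pipelines.OneFilePerItemPipeline": 300}, where the module path myproject.pipelines is a placeholder for wherever the class actually lives.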
