
Exporting data to separate CSV files in Scrapy

I have made a Scrapy crawler that goes to https://www.cartoon3rbi.net/cats.html. The second rule opens the link to every show and gets its title in the title_parse method, and the third rule opens every episode's link and gets its name. It works fine; I just need to know how to create a separate CSV file of episode names for each show, with the title from the title_parse method used as the name of the CSV file. Any suggestions?

# -*- coding: utf-8 -*-
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class FfySpider(CrawlSpider):
    custom_settings = {
        'CONCURRENT_REQUESTS': 1
    }
    name = 'FFy'
    allowed_domains = ['cartoon3rbi.net']
    start_urls = ['https://www.cartoon3rbi.net/cats.html']

    rules = (
        Rule(LinkExtractor(restrict_xpaths='//div[@class="pagination"]/a[last()]'), follow=True),
        Rule(LinkExtractor(restrict_xpaths='//div[@class="cartoon_cat"]'), callback='title_parse', follow=True),
        Rule(LinkExtractor(restrict_xpaths='//div[@class="cartoon_eps_name"]'), callback='parse_item', follow=True),
    )

    def title_parse(self, response):
        title = response.xpath('//div[@class="sidebar_title"][1]/text()').extract()

    def parse_item(self, response):
        for el in response.xpath('//div[@id="topme"]'):
            yield {
                'name': el.xpath('//div[@class="block_title"]/text()').extract_first()
            }

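Assuming you store the titles in a list named titles and the corresponding contents in a list named contents, you can call the following custom function write_to_csv(title, content) once per pair to write the content to a file named <title>.csv: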

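def write_to_csv(title, content=''):
    # Write content to <title>.csv. If no content is provided,
    # an empty csv file is created.
    # utf-8 keeps non-ASCII titles and episode names intact.
    with open(title + '.csv', 'w', encoding='utf-8') as f:
        f.write(content)

# titles and contents are parallel lists collected beforehand.
for title, content in zip(titles, contents):
    write_to_csv(title, content)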
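
Inside a Scrapy project, the same idea could be expressed as an item pipeline, so that files are written while the crawl runs instead of from lists collected afterwards. The following is only a minimal sketch under some assumptions that are not in the question: the class name PerShowCsvPipeline, the output_dir parameter, and the _safe_name helper are hypothetical, and each yielded item is assumed to carry a 'show' key with the show title next to the existing 'name' key (the spider in the question does not set it yet).

```python
import csv
import os
import re


class PerShowCsvPipeline:
    """Hypothetical pipeline: one CSV file per show, named after the show title."""

    def __init__(self, output_dir='.'):
        # output_dir is an assumption; by default files go to the working directory.
        self.output_dir = output_dir
        self.files = {}

    def open_spider(self, spider):
        # Maps show title -> (open file handle, csv writer).
        self.files = {}

    def _safe_name(self, title):
        # Replace characters that are invalid in file names on common systems.
        return re.sub(r'[\\/:*?"<>|]+', '_', title).strip() or 'untitled'

    def process_item(self, item, spider):
        show = item['show']  # assumed field, must be set by the spider
        if show not in self.files:
            path = os.path.join(self.output_dir, self._safe_name(show) + '.csv')
            f = open(path, 'w', newline='', encoding='utf-8')
            writer = csv.writer(f)
            writer.writerow(['name'])  # header row
            self.files[show] = (f, writer)
        self.files[show][1].writerow([item['name']])
        return item

    def close_spider(self, spider):
        # Close every file that was opened during the crawl.
        for f, _ in self.files.values():
            f.close()
```

Such a pipeline would be enabled through the ITEM_PIPELINES setting, for example inside the spider's custom_settings, using whatever module path the class actually lives at. Keeping one open writer per show avoids reopening a file for every episode, at the cost of holding several file handles until the spider closes.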

