Scrapy - How to export a CSV file with item keys in the header
I don't want to use the -o option to export the CSV; I want to create the file from my Scrapy script instead. My CSV file exports the items correctly, but it has no header row. I would like a header that corresponds to my items' keys.
I read in several forums and tutorials that the header has to be defined in pipelines.py. I tried different solutions with open_spider, but none of them worked.
Here is my pipelines.py code:
class CsvWriterPipeline(object):

    def __init__(self):
        self.csvwriter = csv.writer(open(fichier1, 'wb'))

    def open_spider(self, spider):
        header_keys = item.fields.keys()
        self.csvwriter.writerow(header_keys)

    def process_item(self, item, spider):
        self.csvwriter.writerow(
            [item['nom_course'][0],
             item['nom_evenement'][0],
             item['distance'][0],
             item['date'][0],
             item['contact_1'][0],
             item['contact_2'][0],
             item['organisateur'][0],
             item['site_internet_evenement'][0],
             item['description'][0],
             item['prix'][0],
             item['nb_participant'][0],
             item['URL_Even'][0],
             item['pays'][0],
             item['region'][0],
             item['ville'][0],
             item['tag'][0]])
        return item
settings.py:
BOT_NAME = 'AHOTU_V2'
SPIDER_MODULES = ['AHOTU_V2.spiders']
NEWSPIDER_MODULE = 'AHOTU_V2.spiders'
ITEM_PIPELINES = {
    'AHOTU_V2.pipelines.CsvWriterPipeline': 800,
}
ROBOTSTXT_OBEY = True
When the spider is opened, no item exists yet, so the name item is not defined inside open_spider. The function below therefore cannot work:
    def open_spider(self, spider):
        header_keys = item.fields.keys()
        self.csvwriter.writerow(header_keys)
Instead, keep a flag that records whether the header has been written, and write the header when the first item arrives:
import csv


class CsvWriterPipeline(object):

    def __init__(self):
        self.csvwriter = None
        self.headers_written = False

    def open_spider(self, spider):
        # fichier1 is the output path defined elsewhere in the asker's project.
        # 'wb' is the Python 2 mode; on Python 3 use open(fichier1, 'w', newline='').
        self.csvwriter = csv.writer(open(fichier1, 'wb'))

    def process_item(self, item, spider):
        # Write the header once, as soon as the first item is available.
        if not self.headers_written:
            # Note: the key order of item.fields may differ from the
            # column order written below.
            header_keys = item.fields.keys()
            self.csvwriter.writerow(header_keys)
            self.headers_written = True

        # Each field holds a list; write its first element.
        self.csvwriter.writerow(
            [item['nom_course'][0],
             item['nom_evenement'][0],
             item['distance'][0],
             item['date'][0],
             item['contact_1'][0],
             item['contact_2'][0],
             item['organisateur'][0],
             item['site_internet_evenement'][0],
             item['description'][0],
             item['prix'][0],
             item['nb_participant'][0],
             item['URL_Even'][0],
             item['pays'][0],
             item['region'][0],
             item['ville'][0],
             item['tag'][0]])
        return item
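
One thing to watch with the approach above is that the header comes from item.fields.keys() while the row is built in a hand-written order, so the two may not line up. A possible variant, sketched here for Python 3, drives both the header and each row from one list of field names. Since the names are known up front, the header can then be written in open_spider after all. The FIELD_NAMES constant, the path and field_names parameters, and the default file name output.csv are assumptions made for this illustration; the field names themselves are taken from the question.

```python
import csv

# Assumed column order for this sketch; names are the question's item fields.
FIELD_NAMES = [
    'nom_course', 'nom_evenement', 'distance', 'date',
    'contact_1', 'contact_2', 'organisateur', 'site_internet_evenement',
    'description', 'prix', 'nb_participant', 'URL_Even',
    'pays', 'region', 'ville', 'tag',
]


class CsvWriterPipeline:
    # Defaults let the pipeline be created without arguments.
    def __init__(self, path='output.csv', field_names=FIELD_NAMES):
        self.path = path  # hypothetical output file name
        self.field_names = list(field_names)
        self.file = None
        self.writer = None

    def open_spider(self, spider):
        # newline='' lets the csv module control line endings on Python 3.
        self.file = open(self.path, 'w', newline='', encoding='utf-8')
        self.writer = csv.writer(self.file)
        # The header uses the same list as the rows, so columns always match.
        self.writer.writerow(self.field_names)

    def process_item(self, item, spider):
        row = []
        for name in self.field_names:
            value = item.get(name)
            # The question stores each field as a list; keep the first element.
            if isinstance(value, (list, tuple)):
                value = value[0] if value else ''
            row.append('' if value is None else value)
        self.writer.writerow(row)
        return item

    def close_spider(self, spider):
        # Close the file so buffered rows are flushed to disk.
        self.file.close()
```

Unlike item['nom_course'][0], this version writes an empty cell when a field is missing or empty instead of raising an exception; whether that is desirable depends on the data.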