Scrapy yield comma separated csv file
I have written a program to extract material from a website, as follows. It works and does generate a CSV file. However, when I open the file in Excel, the data does not appear to be comma-separated. How can I fix this so that the file is comma-separated?
import scrapy


class JPItem(scrapy.Item):
    # One field per column in the exported CSV.
    question_title = scrapy.Field()
    question_content = scrapy.Field()
    question_link = scrapy.Field()
    best_answer = scrapy.Field()
    best_answer_link = scrapy.Field()


class JPSpider(scrapy.Spider):
    name = "jp"
    allowed_domains = ['detail.chiebukuro.yahoo.co.jp']
    # Note: this comprehension builds the entire URL list eagerly,
    # and the range (kept from the original post) is very large.
    start_urls = [
        'https://detail.chiebukuro.yahoo.co.jp/qa/question_detail/q' + str(x)
        for x in range(10000000000, 100000000000)
    ]

    def parse(self, response):
        item = JPItem()
        # extract_first() returns a single string (or None).
        item['question_title'] = response.css("div.mdPstd.mdPstdQstn.sttsRslvd.clrfx div.ttl h1::text").extract_first()
        # extract() returns a list; ''.join() collapses it into one string.
        item['question_content'] = ''.join(response.css("div.mdPstdQstn div.ptsQes p::text").extract())
        item['question_link'] = ''.join(response.css("div.mdPstdQstn p:not([class]) a::text").extract())
        item['best_answer'] = ''.join(response.css("div.mdPstdBA div.ptsQes p.queTxt::text").extract())
        item['best_answer_link'] = ''.join(response.css("div.mdPstdBA p:not([class]) a::text").extract())
        yield item
Each item property extracted with extract() comes back as a list, which is why the values look comma-separated in your file. The last four item properties in your spider are not lists, however, because you call ''.join() on them, which collapses each list into a single string. If you want each list element to populate its own cell when the CSV file is opened in Excel, you need to iterate through your lists and yield each element separately.
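As a rough illustration of "iterate and yield each one separately", the sketch below uses only the standard library rather than Scrapy itself, so that it can be run on its own. The helper names rows_from_list and to_csv are hypothetical and do not come from the original post; inside a real spider, the loop in rows_from_list would sit in parse() and yield items instead of plain dicts.

```python
import csv
import io


def rows_from_list(title, paragraphs):
    """Yield one row per extracted paragraph instead of one joined string.

    `title` and `paragraphs` stand in for the values a spider would get
    from extract_first() and extract(); both names are illustrative.
    """
    for text in paragraphs:
        yield {"question_title": title, "question_content": text}


def to_csv(rows, fieldnames):
    """Serialize an iterable of dict rows to comma-separated CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    fields = ["question_title", "question_content"]
    print(to_csv(rows_from_list("T", ["a", "b, c"]), fields))
```

Each yielded row becomes its own line in the output, and a value that itself contains a comma is quoted by the csv module, so it still lands in a single cell.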