Python csv writer adds quotes on empty first line, but not subsequent lines

I am trying to use Python's csv standard library module to generate comma-separated value (CSV) files.

It will not allow the first line to be blank. More annoyingly, it treats the first line differently from other lines: a row containing a single empty string is written as a quoted empty string ("") on the first line and as a blank line thereafter:

import csv
import io

def make_csv(rows):
  with io.StringIO(newline='') as sout:
    writer = csv.writer(sout, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    return sout.getvalue()

Given the above definition, an interpreter session might look like:

>>> make_csv([[''], ['']]) # (only the) first line has quoted empty string
'""\r\n\r\n'

>>> make_csv([['A'], ['A']]) # expected: same input row, same output row
'A\r\nA\r\n'

Why does this quoted empty string happen only on the first line? Is there any way I can stop it, or at least get more consistent behavior?


Update: this is a bug, reported in December 2017 as https://bugs.python.org/issue32255 and resolved by commit https://github.com/python/cpython/commit/2001900b0c02a397d8cf1d776a7cc7fcb2a463e3, which was included in the 3.6.5 release.
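To see which behavior a given interpreter has, a small probe along these lines may help. It is only a sketch: the helper name `first_row_is_special` is made up for illustration, and it assumes the default `'\r\n'` line terminator of the `excel` dialect.

```python
import csv
import io
import sys

def first_row_is_special():
    """Return True if a [''] row is written differently on the first line.

    Hypothetical helper, not part of the csv module.
    """
    with io.StringIO(newline='') as sout:
        csv.writer(sout).writerows([[''], ['']])
        # Split on the default line terminator; the last element is always ''.
        lines = sout.getvalue().split('\r\n')
    # Both rows are identical, so their serialized forms should match.
    return lines[0] != lines[1]

print(sys.version_info[:3], first_row_is_special())
```

On an interpreter that includes the fix, the two rows should serialize identically and the probe should report False.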

You can force the csv writer to quote empty strings by setting a different quoting strategy. Both QUOTE_ALL and QUOTE_NONNUMERIC will do what you want:

def make_csv(rows):
  with io.StringIO(newline='') as sout:
    writer = csv.writer(sout, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerows(rows)
    return sout.getvalue()

>>> make_csv([[''], ['']])
'""\r\n""\r\n'
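For comparison, here is a sketch of the same function with the quoting strategy passed in as a parameter, so QUOTE_ALL can be tried as well. The name `make_csv_with` is invented for this example. The two strategies agree on empty strings but differ on numbers: QUOTE_ALL quotes every field, while QUOTE_NONNUMERIC leaves numeric fields unquoted.

```python
import csv
import io

def make_csv_with(rows, quoting):
    """Serialize rows to CSV text using the given csv.QUOTE_* constant."""
    with io.StringIO(newline='') as sout:
        writer = csv.writer(sout, quoting=quoting)
        writer.writerows(rows)
        return sout.getvalue()

# Empty strings are quoted on every line, not just the first.
all_quoted = make_csv_with([[''], ['']], csv.QUOTE_ALL)  # '""\r\n""\r\n'

# The strategies differ only in how they treat numeric fields.
mixed_all = make_csv_with([['', 1]], csv.QUOTE_ALL)  # '"","1"\r\n'
mixed_nonnumeric = make_csv_with([['', 1]], csv.QUOTE_NONNUMERIC)  # '"",1\r\n'
```

Which one fits better probably depends on whoever reads the file: QUOTE_NONNUMERIC keeps numbers distinguishable from strings, at the cost of quoting every string field.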

I don't know why the default strategy treats the first line differently from other lines, but I believe it's a bug. If you load CSV data in which the second line isn't quoted, you'll notice that the output differs from the input you originally used to create the CSV:

>>> data = [[''], ['']]
>>> text = make_csv(data)
>>> text
'""\r\n\r\n'
>>> f = io.StringIO(text)
>>> reader = csv.reader(f)
>>> list(reader)
[[''], []]
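The round-trip comparison from that session can be wrapped in a small helper, which makes it easy to check any data set against any quoting strategy. This is a sketch only; `round_trips` is a hypothetical name, and it reads the text back with the reader's default settings.

```python
import csv
import io

def round_trips(rows, quoting=csv.QUOTE_MINIMAL):
    """Return True if writing rows and reading them back yields the same rows."""
    with io.StringIO(newline='') as sout:
        csv.writer(sout, quoting=quoting).writerows(rows)
        text = sout.getvalue()
    # newline='' keeps the '\r\n' terminators intact for the reader.
    with io.StringIO(text, newline='') as sin:
        return list(csv.reader(sin)) == rows

data = [[''], ['']]
print(round_trips(data, csv.QUOTE_ALL))      # quoting everything survives the round trip
print(round_trips(data, csv.QUOTE_MINIMAL))  # depends on whether the interpreter has the fix
```

The comparison only holds for string data: the default reader returns every field as a string, so rows containing numbers will not compare equal to their originals.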
