How do I remove the quote characters from the first field name when unicodecsv.DictReader parses a UTF-8-BOM file in Python 2.7?

Question

When unicodecsv.DictReader parses a UTF-8-BOM encoded CSV file whose fields are quoted, the first field name keeps its quote characters, whereas all subsequent field names have them removed correctly.

Example UTF-8-BOM encoded CSV file:

"Field1","Field2","Field3"
content1,content2,content3

Example Python code:

from unicodecsv import DictReader
filename = "/tmp/test.csv"
with open(filename, mode='r') as read_stream:
     reader = DictReader(read_stream, encoding='utf-8-sig')
     print reader.fieldnames

Printed value:

['"Field1"','Field2','Field3']

Is there a way to make the first field behave like the others and have its quote characters removed?

Answer 1

One way is to consume the BOM manually (though I expect the code as written demonstrates an actual bug in the underlying library, which should be reported in its issue tracker on GitHub). After consuming the BOM, use the utf-8 codec instead.

# My test code to write a file with a BOM
import io
filename = "/tmp/test.csv"
# Write to the same path that is read back below
with io.open(filename, 'w', encoding='utf-8-sig') as f:
    f.write(u'''\
"Field1","Field2","Field3"
content1,content2,content3
''')

from unicodecsv import DictReader
# unicodecsv works on byte streams, so open the file in binary mode
with open(filename, mode='rb') as read_stream:
    # Consume the BOM (the UTF-8 BOM is 3 bytes long)
    read_stream.read(3)
    reader = DictReader(read_stream, encoding='utf-8')
    print reader.fieldnames
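
The snippet above skips three bytes unconditionally, so it would swallow real data if the file happened to have no BOM. A minimal sketch of a more defensive variant follows, using only the standard library. The helper name open_skipping_bom is hypothetical and is not part of unicodecsv. It checks for the BOM and rewinds when none is found.

```python
import codecs
import io


def open_skipping_bom(path):
    """Open path in binary mode, positioned just past a UTF-8 BOM if one exists.

    Hypothetical helper for illustration; the caller must close the stream.
    """
    stream = io.open(path, 'rb')
    head = stream.read(len(codecs.BOM_UTF8))
    if head != codecs.BOM_UTF8:
        # No BOM present: rewind so the first bytes of data are not lost.
        stream.seek(0)
    return stream
```

The returned stream can then be passed to DictReader(stream, encoding='utf-8') exactly as in the answer's code. A likely explanation for the original behaviour, offered here as an assumption rather than a confirmed diagnosis, is that the BOM bytes sit in front of the opening quote, so the parser does not see the quote at the start of the field and treats it as literal text.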
