
Python CSV: nested double quotes

I have a test.csv file as follows:

"N";"INFO"
"1";"<a href="www.google.it">www.google.it</a>"

I use the following program to print the contents of the CSV file:

import csv
with open('test.csv', newline='') as csvfile:
    reader=csv.DictReader(csvfile, delimiter=';')
    for p in reader:
        print("%s %s" % (p['N'], p['INFO']))

The output is:

1 <a href=www.google.it">www.google.it</a>"

The reason is probably that the CSV file contains "nested" double quotes. However, the separator is ";", so I would like the library to simply remove the double quote (") at the beginning and at the end of the INFO field and keep the rest of the string intact.

In other words, I would like the output of the program to be:

1 <a href="www.google.it">www.google.it</a>

How can I fix that without modifying the test.csv file?
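To see why the default reader produces that output, the parsing can be reproduced in memory. This is a minimal sketch, assuming the file content is held in a string (the `data` variable and the use of io.StringIO are illustrative, not part of the original program). With the default lenient settings, the quote after `href=` ends the quoted section and the rest of the field is read literally; with `strict=True` the csv module raises csv.Error on the same input instead.

```python
import csv
import io

# Same content as test.csv, held in memory for illustration (assumption).
data = '"N";"INFO"\n"1";"<a href="www.google.it">www.google.it</a>"\n'

# Default (lenient) parsing: the quote after href= ends the quoted section,
# and the remainder of the field is read literally, quotes included.
rows = list(csv.reader(io.StringIO(data), delimiter=';'))
print(rows[1][1])

# With strict=True the malformed field is rejected instead of being mangled.
try:
    list(csv.reader(io.StringIO(data), delimiter=';', strict=True))
    strict_error = None
except csv.Error as exc:
    strict_error = exc
print("strict parse failed:", strict_error)
```

So the file is not valid under the usual quoting rules, which is why the quote handling has to be done outside the default parser.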

One possibility is to use the csv module with csv.QUOTE_NONE, then remove the quotes manually (from both the field names and the values):

import csv

def strip_outer_quotes(s):
    """ Strip an outer pair of quotes (only) from a string. If not quoted,
    string is returned unchanged. """
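    # The length check avoids an IndexError on empty fields and keeps a
    # lone '"' from being treated as a quoted pair.
    if len(s) >= 2 and s[0] == s[-1] == '"':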
        return s[1:-1]
    else:
        return s

def my_csv_reader(fh):
    """ Thin wrapper around csv.DictReader to handle fields which are
    quoted but contain unquoted " characters. """
    reader = csv.DictReader(fh, delimiter=';', quoting=csv.QUOTE_NONE)
    reader.fieldnames = [strip_outer_quotes(fn) for fn in reader.fieldnames]
    for row in reader:
        yield {k: strip_outer_quotes(v) for k, v in row.items()}

with open('test.csv', newline='') as csvfile:
    reader = my_csv_reader(csvfile)
    for p in reader:
        print("%s %s" % (p['N'], p['INFO'])) 

Note: instead of my_csv_reader, it is probably better to name the function after the source of this particular CSV variant, e.g. acme_csv_reader or similar.
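One caveat with csv.QUOTE_NONE is that quotes no longer protect the delimiter, so a ";" inside a quoted field splits that field. The following is a small sketch of that behavior, assuming an in-memory copy of the data with one extra hypothetical row (`"2";"a;b"`) that is not in the original file; it uses a plain csv.reader rather than the DictReader wrapper to keep the example short.

```python
import csv
import io

def strip_outer_quotes(s):
    """Strip one outer pair of double quotes, if present."""
    if len(s) >= 2 and s[0] == s[-1] == '"':
        return s[1:-1]
    return s

# The first two lines match test.csv; the third line is a hypothetical row
# added only to show what happens when a quoted field contains ';'.
data = (
    '"N";"INFO"\n'
    '"1";"<a href="www.google.it">www.google.it</a>"\n'
    '"2";"a;b"\n'
)

reader = csv.reader(io.StringIO(data), delimiter=';', quoting=csv.QUOTE_NONE)
rows = [[strip_outer_quotes(field) for field in row] for row in reader]

for row in rows:
    print(row)
```

The second row comes out as wanted, while the hypothetical third row is split into three fields. If the real data can contain the delimiter inside quoted fields, this approach would need extra handling.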

