How to handle double quotes inside field values with csv module?

Question

I'm trying to parse CSV files from an external system over which I have no control.

A comma is used as the separator.
When a cell contains a comma, it is wrapped in quotes and all other quote characters are escaped with another quote character.
(My problem) When a cell is not wrapped in quotes, all quote characters are nonetheless escaped with another quote character.

Example CSV:

qw""erty,"a""b""c""d,ef""""g"

Should be parsed as:

[['qw"erty', 'a"b"c"d,ef""g']]

However, it seems that Python's csv module does not expect quote characters to be escaped when the cell was not wrapped in quote characters in the first place. csv.reader(my_file) (with the default doublequote=True) returns:

['qw""erty', 'a"b"c"d,ef""g']
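The behaviour described above can be reproduced without a file. This is a minimal sketch, assuming Python 3 and using io.StringIO as a stand-in for my_file; the variable names are only illustrative.

```python
import csv
from io import StringIO

# The example line from the question, held in memory instead of a file.
sample = 'qw""erty,"a""b""c""d,ef""""g"'

# Default dialect settings: quotechar='"', doublequote=True.
rows = list(csv.reader(StringIO(sample)))
print(rows)
```

The quoted cell has its doubled quotes collapsed, while the unquoted cell keeps them as they are.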

Is there any way to parse this with the Python csv module?

Answer 1

This follows up on @JackManey's comment, which suggested replacing all instances of '""' inside double quotes with '\\"'.

Recognizing whether we are currently inside a double-quoted cell turned out to be unnecessary: we can simply replace all instances of '""' with '\\"'. The Python documentation says:

On reading, the escapechar removes any special meaning from the following character

However, this still breaks when the original cell already contains escape characters. For example, 'qw\\\\""erty' produces [['qw\\"erty']], losing one backslash. So we also have to escape the escape characters before parsing.

Final solution:

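import csv
from io import StringIO

def parse_csv(file_path):
  with open(file_path, newline='') as f:
    # Escape existing backslashes first, then turn each doubled quote into \"
    content = f.read().replace('\\', '\\\\').replace('""', '\\"')
  return list(csv.reader(StringIO(content), doublequote=False, escapechar='\\'))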
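To check the two replacements without touching the file system, here is a small sketch that applies them to in-memory strings. It assumes Python 3, and parse_rows is a hypothetical helper name, not something from the csv module.

```python
import csv
from io import StringIO

def parse_rows(text):
    # Hypothetical helper: escape backslashes, then rewrite "" as \" so that
    # csv can treat the backslash as the escape character.
    prepared = text.replace('\\', '\\\\').replace('""', '\\"')
    return list(csv.reader(StringIO(prepared), doublequote=False, escapechar='\\'))

# The example line from the question.
example = parse_rows('qw""erty,"a""b""c""d,ef""""g"')

# A cell that already contains two backslashes before the doubled quote.
with_backslashes = parse_rows('qw\\\\""erty')

# An empty quoted cell is rewritten as well (see the note below).
empty_cell = parse_rows('a,"",b')

print(example)
print(with_backslashes)
print(empty_cell)
```

One thing to watch, as far as I can tell: an empty quoted cell ("") is also matched by the replacement and comes out as a single quote character, so this approach assumes the input contains no empty quoted cells.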

Answer 2

As @JackManey suggested, after reading the file you can replace each pair of double quotes with a single double quote.

my_file_onequote = [[col.replace('""', '"') for col in row] for row in csv.reader(my_file)]
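It may help to run this post-processing on the example from the question. The sketch below assumes Python 3 and reads the line from io.StringIO rather than a real file; the variable names are illustrative.

```python
import csv
from io import StringIO

sample = 'qw""erty,"a""b""c""d,ef""""g"'

# Parse with the default dialect, then collapse doubled quotes in every cell.
rows = list(csv.reader(StringIO(sample)))
collapsed = [[col.replace('""', '"') for col in row] for row in rows]
print(collapsed)
```

The unquoted cell comes out as wanted, but the quoted cell was already unescaped by csv, so its legitimate pair of quotes (ef""g) is collapsed a second time. On this input the result therefore differs from the expected output in the question.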

Answer 1: score 5, accepted, 2015-02-26 14:55:37
Answer 2: score 0, 2015-02-25 17:49:51
