
How to read CSV file ignoring commas between quotes with Pandas

I've recently started to use Pandas.

Here's my CSV file.

column1,column2,column3
a, b, c
a, b, "c, d"

I want "c, d" to end up in column3, like this:

Column1 Column2 Column3
a b c
a b c, d

But using data = pd.read_csv('testfile.csv', sep=',', quotechar='"', encoding='utf-8') I get this table instead:

Column1 Column2 Column3
a b c
a, b, "c, d" None None

I've tried changing the values of some of the read_csv parameters, and also a regular expression from here.

You might try

data = pd.read_csv('testfile.csv', sep=',', quotechar='"',
                   skipinitialspace=True, encoding='utf-8')

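which tells pandas to skip the whitespace that follows each comma. Without it, the third field starts with a space rather than with the quote character, so the quote is not recognized as opening a quoted field and the comma inside it is treated as a delimiter.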
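The effect of that option can be seen without pandas. The sketch below uses Python's built-in csv module, which has a parameter with the same name; pandas uses its own parser, so treat this as an illustration of the rule rather than a trace of what pandas does internally. The sample text and the helper name parse are made up for this example.

```python
import csv
import io

# Same content as the question's file, held in memory instead of on disk.
SAMPLE = 'column1,column2,column3\na, b, c\na, b, "c, d"\n'


def parse(text, skip_space):
    """Parse text with csv.reader and return every row as a list of fields."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=",",
        quotechar='"',
        skipinitialspace=skip_space,
    )
    return list(reader)


rows_default = parse(SAMPLE, skip_space=False)
rows_skipped = parse(SAMPLE, skip_space=True)

# Without skipinitialspace the quote is not the first character of the
# field, so it is kept as literal text and the inner comma splits the row.
print(rows_default[2])

# With skipinitialspace the leading space is dropped, the quote opens a
# quoted field, and "c, d" stays together.
print(rows_skipped[2])
```

The last data row comes back as four fields in the first case and as three in the second, with c, d kept in one field.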

EDIT: Apparently this does not work for the author of the question.

So here is a complete script that produces the desired result. I am using Python 3.8.9 and pandas 1.2.3.

itworks.py

import pandas as pd

with open("testfile.csv", "w") as f:
    f.write("""column1,column2,column3
a, b, c
a, c, "c, d"
""")

data = pd.read_csv("testfile.csv", sep=",", quotechar='"', skipinitialspace=True, encoding="utf-8")
print(data)

$ python itworks.py
  column1 column2 column3
0       a       b       c
1       a       c    c, d
$

Try to reproduce this minimal example on your machine.
