
Pandas: reading a CSV file with " in the data

I want to parse a CSV file whose data looks like the sample below. When I use ," as the separator, the fields are not distributed to the columns correctly. Is there any way to ignore the " characters, or to escape them with a regex?

3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" 4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" 5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" 5,"Christine Jacoba Aaftink","F",21,185,82,"Netherlands" 6,"Per Knut Aaland","M",31,188,75,"United States","USA"

Thanks in advance

Reading the csv file (assuming no new line between the rows):

with open('data') as f:
    raw = f.readline()

Some splitting and processing:

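data = []
for r in raw.strip().split('" '):
    # split() eats the closing quote; restore it (the last record still has it)
    r = r if r.endswith('"') else r + '"'
    data.append(r.split(','))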

Creating the final dataframe:

import pandas as pd

df = pd.DataFrame(data)
df

Output:

[Image: enter image description here]
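
As a possible alternative, here is a minimal sketch that lets Python's csv module handle the quotes instead of splitting on commas by hand. It assumes, as above, that all records sit on one line, and additionally that every record starts with a numeric ID followed by a comma. The function name parse_records and the shortened sample string are made up for illustration.

```python
import csv
import re


def parse_records(raw):
    """Split a one-line dump into records, then parse each with csv.reader."""
    # Assumption: a new record starts after a closing quote and a space,
    # and begins with a numeric ID followed by a comma.
    records = re.split(r'(?<=") (?=\d+,)', raw.strip())
    # csv.reader removes the surrounding quotes and keeps commas that
    # appear inside quoted fields.
    return list(csv.reader(records))


# Shortened copy of the sample from the question (illustration only).
sample = ('3,"Gunnar Nielsen Aaby","M",24,NA,NA,"Denmark","DEN" '
          '4,"Edgar Lindenau Aabye","M",34,NA,NA,"Denmark/Sweden" '
          '6,"Per Knut Aaland","M",31,188,75,"United States","USA"')

rows = parse_records(sample)
print(rows[1])
```

The returned list of rows can be passed to pd.DataFrame in the same way as data above. Rows with fewer fields are padded with missing values, and the quote characters no longer appear in the cells.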

