How to read csv file where columns are separated by double quotes and spaces

I have already searched Google and Stack Overflow, but most similar answers involve a single character acting as the separator (such as a comma or |). In the CSV I have, a line of data looks like this:

"2017-02-27 ""2017-02-25"" ""15438"" ""2017-02-27",19,"671"" ""1"" ""14"" ""John Smith"" ""614""

Each value is meant to be its own column, so the line above should give 8 columns. A further problem is that the value 2017-02-27",19,"671 belongs in a single column, even though it contains double quote characters and commas.

So it seems like the delimiter is this: "" ""

How can I read this in properly?

Also, as a side question: the headers are in the first row of the CSV file, but they are separated by spaces only (the header names themselves use underscores, e.g. name_1 name_2 name_3). Is there a way to read this row in with read_csv, or is it easier to copy that row and pass it as a list to the names parameter?

Thanks!

Edit: I already tried sep='"" ""', which returns everything as one column. Here is everything I have tried (taken from other Stack Overflow threads):

sep='"" ""'
sep=',\s+',quoting=csv.QUOTE_ALL
sep=" ", quotechar="~"
sep='["]* ["]*', engine='python'

If I put your data, exactly as posted, into a CSV file and run this:

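import pandas as pd

# Split on any single whitespace character, then strip every double quote.
# The raw string avoids Python's invalid-escape warning for '\s'.
df = pd.read_csv('test.csv', header=None, sep=r'\s', engine='python').replace('"', '', regex=True)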
df

I get

            0           1      2                  3  4   5     6      7    8
0  2017-02-27  2017-02-25  15438  2017-02-27,19,671  1  14  John  Smith  614

Then split the column in question:

df[['n1', 'n2', 'n3']] = df.loc[:, 3].str.split(',', expand=True)

            0           1      2                  3  4   5     6      7    8          n1  n2   n3
0  2017-02-27  2017-02-25  15438  2017-02-27,19,671  1  14  John  Smith  614  2017-02-27  19  671
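
Note that splitting on whitespace also breaks John Smith into two columns (6 and 7 above), so the result has nine columns rather than the eight the question asks for. Here is a minimal sketch of one way around that, using plain string handling instead of read_csv. It assumes every line looks like the sample: wrapped in stray quote characters, with fields separated by a space followed by two double quotes, optionally preceded by two more. The helper name split_record is hypothetical.

```python
import re

# Separator assumed from the sample line: a space followed by two double
# quotes, optionally preceded by two double quotes (the first boundary in
# the sample has no leading pair).
FIELD_SEP = re.compile(r'(?:"")? ""')

def split_record(line):
    """Split one raw line into fields, keeping inner spaces, quotes and commas."""
    # Drop the newline and the quote characters wrapping the whole line.
    body = line.strip().strip('"')
    return FIELD_SEP.split(body)

sample = '"2017-02-27 ""2017-02-25"" ""15438"" ""2017-02-27",19,"671"" ""1"" ""14"" ""John Smith"" ""614""'
fields = split_record(sample)
print(len(fields))  # 8
print(fields)
```

A list of such rows can be passed to pd.DataFrame(rows) to get a frame with one column per value. The fourth field keeps its embedded quotes and commas, so it would still need the quote removal and the comma split shown above if three separate values are wanted.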

If this isn't the result you're looking for, please comment.
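
The side question about the header row is not covered above. As a rough sketch, assuming the first line of the file holds the names separated by spaces (as in name_1 name_2 name_3), the names can be read separately and reused. The helper read_header_names and the throwaway demo file are hypothetical.

```python
import os
import tempfile

def read_header_names(path):
    """Return the column names from a first line like 'name_1 name_2 name_3'."""
    with open(path, encoding='utf-8') as handle:
        # The names contain underscores rather than spaces, so a plain
        # whitespace split is enough.
        return handle.readline().split()

# Demo with a throwaway file; the names and the data line are placeholders.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('name_1 name_2 name_3\n"a ""b"" ""c""\n')

names = read_header_names(tmp.name)
os.remove(tmp.name)
print(names)  # ['name_1', 'name_2', 'name_3']
```

The list could then be passed to read_csv as names=names together with skiprows=1, provided the number of names matches the number of columns the parser actually produces.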

