How to set a space as separator for to_csv()? - python

Question

I need to save a dataframe as a CSV file that another script will read with the delim_whitespace=True option.

Here's an example of what I am trying to do. The dataframe df I'm working with is the following:

    id  var1        var2
0   0   0.000000    0.000000
1   1   0.000000    0.000000
2   2   0.000000    0.000000

I want to save it with whitespace as the delimiter, so I tried:

df.to_csv('df_file.csv', delim_whitespace=True) #does not work
df.to_csv('df_file.csv', sep=r"\s+")            #cannot be opened with a pd.read_csv('df_file.csv', delim_whitespace=True)
df.to_csv('df_file.csv', sep='\t')              #cannot be opened with a pd.read_csv('df_file.csv', delim_whitespace=True)
df.to_csv('df_file.csv', sep=' ')               #cannot be opened with a pd.read_csv('df_file.csv', delim_whitespace=True)
df.to_csv('df_file.csv', sep='    ')            #cannot be saved because sep needs one character apparently

What separator can I use so I can then read that file with the delim_whitespace=True option?

Answer 1

Here is a full save/read example:

Sample data:

import pandas as pd
d = {'id': {0: 0, 1: 1, 2: 2}, 'var1': {0: 0.0, 1: 0.0, 2: 0.0}, 'var2': {0: 0.0, 1: 0.0, 2: 0.0}}
df_save = pd.DataFrame(data=d)

Code:

Pass index=False; otherwise the index is written to the file and comes back as an extra column after loading.

p = r'C:\test.csv'
df_save.to_csv(p, sep=' ', index=False)
df_read = pd.read_csv(p, sep=' ')

Output:

   id  var1  var2
0   0   0.0   0.0
1   1   0.0   0.0
2   2   0.0   0.0
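Since the question asks specifically about reading with delim_whitespace=True, here is a minimal round-trip sketch of the same idea. It writes to an in-memory buffer instead of C:\test.csv (my assumption, so it runs anywhere) and reads with sep=r"\s+", which splits on runs of whitespace the same way delim_whitespace=True does.

```python
import io

import pandas as pd

# Same sample frame as above, built from lists instead of nested dicts.
df_save = pd.DataFrame(
    {"id": [0, 1, 2], "var1": [0.0, 0.0, 0.0], "var2": [0.0, 0.0, 0.0]}
)

# Write with a single space as separator and without the index column.
buffer = io.StringIO()
df_save.to_csv(buffer, sep=" ", index=False)
text = buffer.getvalue()

# Read back, splitting on any run of whitespace.
df_read = pd.read_csv(io.StringIO(text), sep=r"\s+")

# True when values, dtypes and labels survived the round trip.
round_trip_ok = df_save.equals(df_read)
print(text)
print(round_trip_ok)
```

One caveat: a whitespace reader cannot tell a separator from a space inside a value, so this only works cleanly when no cell contains spaces.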

If you experience the error:

ParserError: Error tokenizing data. C error: Expected 66 fields in line 16080, saw 67

This means that line contains at least one more separator than expected, so it splits into one field too many. You can either inspect the file in an editor, e.g. PyCharm or even Excel, and clean that line.

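Or you can simply skip bad lines. Which keyword to use depends on the pandas version, because error_bad_lines was deprecated in pandas 1.3 and removed in 2.0:

df = pd.read_csv('df_file.csv', sep=' ', error_bad_lines=False)  # pandas < 1.3
df = pd.read_csv('df_file.csv', sep=' ', on_bad_lines='skip')    # pandas >= 1.3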
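Before skipping lines it can help to see which ones are off. The sketch below is only an illustration: find_bad_lines is a hypothetical helper (not part of pandas), it uses an in-memory sample instead of a real file, and it splits naively on the separator, so it ignores quoting.

```python
import io


def find_bad_lines(handle, sep=" "):
    """Return (line_number, field_count) for every line whose number of
    fields differs from the header line. Hypothetical helper, not pandas."""
    header = handle.readline().rstrip("\r\n")
    expected = len(header.split(sep))
    bad = []
    # Data lines start at line 2 because line 1 is the header.
    for line_number, line in enumerate(handle, start=2):
        count = len(line.rstrip("\r\n").split(sep))
        if count != expected:
            bad.append((line_number, count))
    return bad


# Line 3 contains a doubled space, which yields an extra (empty) field.
sample = io.StringIO("id var1 var2\n0 0.0 0.0\n1 0.0  0.0\n2 0.0 0.0\n")
bad_lines = find_bad_lines(sample)
print(bad_lines)
```

If the extra fields come only from doubled spaces like this, reading with sep=r"\s+" instead of sep=' ' treats each run of whitespace as one separator and may avoid the error without dropping rows.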

Answer 2

Try using:

df.to_csv("output.csv",sep=' ')

to save the file.

To read the file:

df=pd.read_csv("output.csv",sep=' ')

Because the index is saved too, you will get 'Unnamed: 0' as a column name. To deal with that, just run:

df.drop(columns=['Unnamed: 0'],inplace=True)
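As an alternative to dropping the column afterwards, read_csv can be told that the first column is the index via index_col=0. A small sketch, using an in-memory buffer rather than output.csv (my assumption, to keep it self-contained):

```python
import io

import pandas as pd

df = pd.DataFrame(
    {"id": [0, 1, 2], "var1": [0.0, 0.0, 0.0], "var2": [0.0, 0.0, 0.0]}
)

# The index is written as an unnamed first column because index defaults to True.
buffer = io.StringIO()
df.to_csv(buffer, sep=" ")

# Plain read: the saved index shows up as the 'Unnamed: 0' column.
buffer.seek(0)
with_extra = pd.read_csv(buffer, sep=" ")

# index_col=0 turns that first column back into the index instead.
buffer.seek(0)
restored = pd.read_csv(buffer, sep=" ", index_col=0)

print(list(with_extra.columns))
print(list(restored.columns))
```

Writing with index=False, as in Answer 1, avoids the extra column in the first place.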

Answer 3

First of all, you can't use delim_whitespace in to_csv; it is a read_csv option. Check the documentation.

import pandas as pd
entries=[[0,1,2],[0.,0.,0.],[0.,0.,0.]]
df=pd.DataFrame(entries,columns=['id', 'var1', 'var2'])
df

Output:
    id  var1    var2
0   0.0     1.0     2.0
1   0.0     0.0     0.0
2   0.0     0.0     0.0

Save using sep=' ' and check the resulting file with cat.

df.to_csv('temp.csv',sep=' ')
!cat temp.csv

Output:
 id var1 var2
0 0.0 1.0 2.0
1 0.0 0.0 0.0
2 0.0 0.0 0.0

You can then read it using delim_whitespace=True:

pd.read_csv('temp.csv',delim_whitespace=True)

Output:
    id  var1    var2
0   0.0     1.0     2.0
1   0.0     0.0     0.0
2   0.0     0.0     0.0
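Note that delim_whitespace was deprecated in pandas 2.2 in favour of sep=r"\s+". The sketch below uses that spelling on text laid out the way to_csv(sep=' ') writes a frame with its index (a header with one field fewer than the data rows). The literal text is an assumption standing in for temp.csv.

```python
import io

import pandas as pd

# Header has three names, data rows have four fields: the first field of
# each row is the saved index.
text = " id var1 var2\n0 0.0 1.0 2.0\n1 0.0 0.0 0.0\n2 0.0 0.0 0.0\n"

# Splitting on runs of whitespace skips the leading space in the header,
# so pandas sees one extra field per data row and uses it as the index.
df = pd.read_csv(io.StringIO(text), sep=r"\s+")

print(df)
print(list(df.columns))
```

This is why the whitespace reader gives back the original columns without an 'Unnamed: 0' column, while sep=' ' does not.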

3 answers

solution1
2 ACCEPTED 2021-07-13 09:41:51

solution2
0 2021-07-13 09:37:43

solution3
0 2021-07-13 09:41:52
