how to keep pd.read_csv() from changing datatype

Question

I save a DataFrame using to_csv(), then read it back with read_csv(), and the elements come back with a different datatype.

import pandas as pd

df1i = ['00', '01']
df1 = pd.DataFrame(columns=['00', '01'], index=df1i)
df1
df1.iloc[0, 0] = [11, 22]
df1
# Before saving: the cell holds a list of ints
print(df1.iloc[0, 0])
print(type(df1.iloc[0, 0]))
print(df1.iloc[0, 0][0])
print(type(df1.iloc[0, 0][0]))
# Raw strings keep the backslashes in the Windows path literal
df1.to_csv(r'C:\Thomas\paradigm\df1.csv')
df1 = pd.read_csv(r'C:\Thomas\paradigm\df1.csv', index_col=0)
# After reading back: the same cell is now a string
print(df1.iloc[0, 0])
print(type(df1.iloc[0, 0]))
print(df1.iloc[0, 0][0])
print(type(df1.iloc[0, 0][0]))

[11, 22]
<class 'list'>
11
<class 'int'>
[11, 22]
<class 'str'>
[
<class 'str'>

If anyone can tell me how to control this I will appreciate it.

Clarification of my question:

What I am asking is whether there is a way to get back the same type that I put in. For instance, if I put in an element as an integer, it comes back as a string; an ndarray also comes back as a string.
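To make the behaviour described above concrete, here is a minimal sketch that round-trips a small frame through an in-memory buffer rather than a file. The frame, its labels, and the variable names are made up for illustration. It suggests that CSV only carries text: a list is written as its string form, an integer sharing a column with that list comes back as a string as well, and a column containing only integers is inferred as integers again.

```python
import io

import pandas as pd

# Hypothetical frame: column "00" mixes a list with an integer, column "01" is all integers.
df = pd.DataFrame({"00": [[11, 22], 5], "01": [1, 2]}, index=["r0", "r1"])

# Write to an in-memory buffer instead of a file so the raw CSV text can be inspected.
buf = io.StringIO()
df.to_csv(buf)
csv_text = buf.getvalue()
print(csv_text)

# Read the same text back without any converters.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(type(restored.iloc[0, 0]))  # the list is now a str
print(type(restored.iloc[1, 0]))  # the integer sharing that column is a str too
print(restored["01"].dtype)       # the all-integer column is inferred as an integer dtype
```

So the type change happens because nothing in the file records what the original Python object was; read_csv() can only guess from the text of each column.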

Answer 1

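Storing lists as elements of a DataFrame is unusual, and CSV is a plain-text format, so to_csv() writes the list as its string representation and read_csv() has no way to know it was a list. If this is what you need to do, pass a converter to read_csv() that uses ast.literal_eval to turn the string back into a list.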

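import ast

import pandas as pd

df1i = ['00', '01']
df1 = pd.DataFrame(columns=['00', '01'], index=df1i)
df1.iloc[0, 0] = [11, 22]

# Before the round trip: the cell holds a list of ints
print(df1.iloc[0, 0])
print(type(df1.iloc[0, 0]))
print(df1.iloc[0, 0][0])
print(type(df1.iloc[0, 0][0]))

df1.to_csv('df1.csv')

# Parse non-empty cells of column '00' back into Python literals;
# empty cells (the NaN written by to_csv) are left as empty strings
df1 = pd.read_csv(
    'df1.csv',
    index_col=0,
    converters={'00': lambda x: ast.literal_eval(str(x)) if len(str(x)) > 0 else x},
)

# After the round trip: the cell is a list again
print(df1.iloc[0, 0])
print(type(df1.iloc[0, 0]))
print(df1.iloc[0, 0][0])
print(type(df1.iloc[0, 0][0]))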
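The converter above targets a single column. As a rough sketch of how the same idea might be extended to every column, the example below uses a helper named parse_cell, which is hypothetical and not part of pandas, that tries ast.literal_eval and falls back to the original text when the cell is not a Python literal. The sample frame and the in-memory buffer are assumptions made only to keep the example self-contained.

```python
import ast
import io

import pandas as pd


def parse_cell(text):
    """Hypothetical helper: rebuild a Python literal from CSV text, else keep the text."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Plain words, empty cells, and anything else that is not a literal stay as strings.
        return text


# Hypothetical frame with a list, an int, a plain string, and a float.
df = pd.DataFrame({"00": [[11, 22], 5], "01": ["text", 2.5]}, index=["r0", "r1"])

buf = io.StringIO()
df.to_csv(buf)

# Apply the same converter to every data column.
restored = pd.read_csv(
    io.StringIO(buf.getvalue()),
    index_col=0,
    converters={name: parse_cell for name in df.columns},
)

for row in restored.index:
    for col in restored.columns:
        value = restored.loc[row, col]
        print(row, col, repr(value), type(value).__name__)
```

This only recovers values whose text is a valid Python literal. An ndarray should be written without commas (for example [11 22]), which ast.literal_eval cannot parse, so converting arrays with .tolist() before saving is one possible workaround. If the file does not have to be CSV, DataFrame.to_pickle() and pd.read_pickle() store the Python objects themselves rather than their text.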

solution1
1 ACCEPTED 2016-09-27 20:47:13
