I am using the line below to read a CSV file, but column B ends up as str and I cannot manage to convert it to float directly while reading:
df = pd.read_csv('data.csv', sep=";", encoding = "ISO-8859-1")
This produces a DataFrame where all columns are of type str:
A B
0 Emma -20,50
1 Filo -15,75
2 Theo 17,23
As you may notice, the decimals are separated by ',' instead of '.' because it is a German CSV. I have already tried the following, to no avail:
..., dtype={'B': np.float32}, decimal= ',' , ....
Any idea how I could get it done in the reading process?
Converting after reading the CSV works (but it is an inefficient extra step I would like to avoid). This is what I use:
df['B'] = df['B'].str.replace(',', '.').astype(float)
It works fine for me; I only omitted dtype={'B': np.float32}:
import pandas as pd
import io
temp=u"""A;B
0;Emma;-20,50
1;Filo;-15,75
2;Theo;17,23"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", encoding = "ISO-8859-1", decimal= ',')
print (df)
A B
0 Emma -20.50
1 Filo -15.75
2 Theo 17.23
print (df.dtypes)
A object
B float64
dtype: object
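If float32 is really required, one option is to let decimal=',' handle the parsing and downcast afterwards. This is only a minimal sketch using the same sample data as the snippet above; the astype step is an addition for illustration and not part of the answer as originally tested.

```python
import io

import numpy as np
import pandas as pd

temp = """A;B
0;Emma;-20,50
1;Filo;-15,75
2;Theo;17,23"""

# decimal=',' tells the parser to treat the comma as the decimal mark,
# so column B is parsed as float64 while reading
df = pd.read_csv(io.StringIO(temp), sep=";", decimal=",")

# optional downcast afterwards, only if float32 is really needed
df["B"] = df["B"].astype(np.float32)
print(df.dtypes)
```

The downcast is a cheap numeric conversion, unlike the string replacement in the question, which has to touch every cell as text.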
EDIT:
I think the problem may be that some decimals use . and some use , in which case you can use converters:
import pandas as pd
import io
temp=u"""A;B
0;Emma;-20,50
1;Filo;-15.75
2;Theo;17,23"""
def converter(x):
    # accept both ',' and '.' as the decimal separator
    return float(x.replace(',', '.'))
# map each column that needs conversion to its converter
converters = {'B': converter}
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 sep=";",
                 encoding="ISO-8859-1",
                 converters=converters)
print (df)
A B
0 Emma -20.50
1 Filo -15.75
2 Theo 17.23
print (df.dtypes)
A object
B float64
dtype: object
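A converter like this raises ValueError on an empty cell, because float('') is not valid. If the file may contain blanks, a slightly more defensive variant could look like the sketch below. The name to_float and the extra row with an empty value are made up for illustration; treating blanks as NaN is an assumption about what you want.

```python
import io

import numpy as np
import pandas as pd

# hypothetical sample: mixed separators plus one empty value
temp = """A;B
0;Emma;-20,50
1;Filo;-15.75
2;Theo;
3;Anna;17,23"""


def to_float(x):
    # converters receive the raw cell text as a string
    x = x.strip()
    if not x:
        return np.nan  # assumption: an empty cell means "missing"
    return float(x.replace(",", "."))


df = pd.read_csv(io.StringIO(temp), sep=";", converters={"B": to_float})
print(df)
print(df.dtypes)
```

Keep in mind that converters run as Python functions per cell, so on large files this is slower than decimal=','. It is mainly worth it when the separators are genuinely mixed.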