
Python: read_csv into dataframe - can't convert object into float

The CSV file I am feeding into read_csv has a couple of columns of percentage changes, but it contains some hidden characters. From repr(data2):

[screenshot of the repr(data2) output]

I tried the following:

data2 = pd.read_csv('C:/Users/nnayyar/Documents/MonteCarlo2.csv', "\n", delimiter = ",", dtype = float)

And got the following error:

ValueError: invalid literal for float(): 7.05%

I tried a few things:

float(data2.replace('/n',''))
map(float, data2.strip().split('\r\n'))

But I received the following errors, respectively:

TypeError: float() argument must be a string or a number
AttributeError: 'DataFrame' object has no attribute 'strip'

Any help converting the CSV columns from object type to float would be appreciated. Thanks!

If every value in your CSV has a percentage sign, then the following will work:

In [203]:
import pandas as pd
import io
t="""0   1   2  3
1.5%  2.5%   6.5%   0.5%"""
# load some dummy data
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(t), sep=r'\s+')
df

Out[203]:
      0     1     2     3
0  1.5%  2.5%  6.5%  0.5%

In [205]:
# apply a lambda that replaces the % signs and cast to float    
df.apply(lambda x: x.str.replace('%','')).astype(float)

Out[205]:
     0    1    2    3
0  1.5  2.5  6.5  0.5

This applies a lambda to each column that calls the vectorised str.replace to remove the % sign; the result can then be converted to float using astype.
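Since the question mentions hidden characters, a slightly more defensive variant of the same idea may be useful. The sketch below assumes the stray characters are whitespace around the values; the sample data, the column names `a` and `b`, and the helper `percent_to_float` are made up for illustration. It reads everything as text, strips whitespace and the trailing `%` per column, and lets `pd.to_numeric` do the conversion.

```python
import io
import pandas as pd

# Made-up sample standing in for the real file: percentages with
# stray spaces and a tab around some of the values.
raw = "a,b\n 7.05% ,1.20%\n-3.10%,\t0.50% \n"

# dtype=str keeps every cell as text, so nothing is parsed prematurely.
df = pd.read_csv(io.StringIO(raw), dtype=str)

def percent_to_float(col):
    """Strip surrounding whitespace and a trailing '%', then convert to numbers."""
    cleaned = col.str.strip().str.rstrip('%')
    # errors='coerce' turns anything still unparseable into NaN instead of raising.
    return pd.to_numeric(cleaned, errors='coerce')

result = df.apply(percent_to_float)
print(result.dtypes)
```

With `errors='coerce'`, cells that still cannot be parsed show up as NaN, which makes it easy to find the offending rows with `result.isna()` rather than stopping at the first bad value.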

So in your case the following should work:

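# the default separator is already ",", so no extra separator argument is needed
data2 = pd.read_csv('C:/Users/nnayyar/Documents/MonteCarlo2.csv')
# strip the % sign from every column, then cast each column to float
data2 = data2.apply(lambda x: x.str.replace('%', '').astype(float))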
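Another option, if you would rather clean the values while the file is being read, is the `converters` argument of `read_csv`. This is only a sketch: the column names `x` and `y`, the sample data, and the helper `parse_percent` are hypothetical, so substitute the real column names from your file.

```python
import io
import pandas as pd

# Made-up sample standing in for the real CSV file.
raw = "x,y\n7.05%,1.20%\n-3.10%,0.50%\n"

def parse_percent(text):
    """Convert a string such as '7.05%' to the float 7.05."""
    return float(text.strip().rstrip('%'))

# converters maps a column name to a function that is called on each raw cell string.
data = pd.read_csv(
    io.StringIO(raw),
    converters={'x': parse_percent, 'y': parse_percent},
)
print(data.dtypes)
```

The trade-off is that `converters` needs the column names (or positions) up front, whereas the `apply` approach works on whatever columns happen to be in the frame.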


 