pandas convert strings to float for multiple columns in dataframe

I'm new to pandas and trying to figure out how to convert multiple columns that are stored as strings to float64. Currently I'm doing the below, but it seems like apply() or applymap() should be able to accomplish this more efficiently. Unfortunately, I'm too much of a rookie to figure out how. The values are percentages formatted as strings, like '15.5%'.

for column in ['field1', 'field2', 'field3']:
    data[column] = data[column].str.rstrip('%').astype('float64') / 100

Starting in pandas 0.11.1 (coming out this week), replace has a new option to replace using a regex, which makes this possible:

In [14]: df = DataFrame('10.0%',index=range(100),columns=range(10))

In [15]: df.replace('%','',regex=True).astype('float')/100
Out[15]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 0 to 99
Data columns (total 10 columns):
0    100  non-null values
1    100  non-null values
2    100  non-null values
3    100  non-null values
4    100  non-null values
5    100  non-null values
6    100  non-null values
7    100  non-null values
8    100  non-null values
9    100  non-null values
dtypes: float64(10)
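
For readers on a current pandas release, here is a minimal sketch of the same replace-based conversion on a small frame. The column names and values are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical sample data: every cell is a percentage string.
df = pd.DataFrame({"field1": ["15.5%", "3%"], "field2": ["100%", "0.5%"]})

# Remove the percent sign in every cell with a regex replace,
# then cast the whole frame to float and rescale to fractions.
converted = df.replace("%", "", regex=True).astype("float64") / 100

print(converted.dtypes)
```

Note that this converts every column in the frame, so it only suits frames where all columns hold percentage strings.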

It is also a bit faster than the applymap equivalent:

In [16]: %timeit df.replace('%','',regex=True).astype('float')/100
1000 loops, best of 3: 1.16 ms per loop

In [18]: %timeit df.applymap(lambda x: float(x[:-1]))/100
1000 loops, best of 3: 1.67 ms per loop

Another option is applymap with rstrip:

df.applymap(lambda x: float(x.rstrip('%')) / 100)
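
The question asked whether apply() could replace the explicit loop. One way to sketch that is a column-wise apply that reuses the vectorized .str.rstrip call from the question. The helper name percent_to_fraction and the sample data are assumptions for illustration. In recent pandas releases applymap is deprecated in favor of DataFrame.map, so a column-wise apply may also age better.

```python
import pandas as pd

# Hypothetical sample data; "name" is a non-percentage column that must stay untouched.
data = pd.DataFrame({
    "field1": ["15.5%", "20%"],
    "field2": ["1%", "99.9%"],
    "name": ["a", "b"],
})
percent_cols = ["field1", "field2"]


def percent_to_fraction(col):
    """Convert one column of strings such as '15.5%' to floats such as 0.155."""
    return col.str.rstrip("%").astype("float64") / 100


# apply() calls the helper once per selected column; the result is assigned back.
data[percent_cols] = data[percent_cols].apply(percent_to_fraction)

print(data.dtypes)
```

Because the helper works on a whole column at a time, the string handling stays vectorized instead of calling a Python function once per cell.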

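In response to a comment on the accepted answer: to convert only specific columns, do not try to do it in place. Assign the result back to the column instead.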

df['Column1'] = df['Column1'].replace('%','',regex=True).astype('float')/100
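
If a column might contain entries that are not valid percentages, astype('float') will raise an error. A possible variation, sketched here with made-up data, swaps astype for pd.to_numeric with errors="coerce", which turns unparseable entries into NaN. This is a substitute for the cast in the line above, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data with one malformed entry ("n/a").
df = pd.DataFrame({
    "Column1": ["10.0%", "n/a", "2.5%"],
    "Column2": ["x", "y", "z"],
})

# Strip the percent sign, then parse; values that cannot be parsed become NaN.
stripped = df["Column1"].replace("%", "", regex=True)
df["Column1"] = pd.to_numeric(stripped, errors="coerce") / 100

print(df)
```

Whether silently coercing bad values to NaN is acceptable depends on the data; raising an error may be the safer default.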
