I have data in a DataFrame where each cell holds two observations:
         small      medium      large
apples   258 0.12%  39 0.0091%  89 0.18%
carrots  97 0.16%   6 0.012%    26 0.26%
bananas  377 0.14%  12 0.018%   128 0.22%
pears    206 0.17%  7 0.034%    116 0.24%
I'd like to create two separate dataframes, to split the observations. Something like this:
         small  medium  large
apples     258      39     89
carrots     97       6     26
bananas    377      12    128
pears      206       7    116
and the second one:
         small   medium  large
apples   0.12%  0.0091%  0.18%
carrots  0.16%   0.012%  0.26%
bananas  0.14%   0.018%  0.22%
pears    0.17%   0.034%  0.24%
I can do the splitting column by column:
new_df1 = df['small'].str.extract(r'(\S+)', expand=True)   # first token
new_df2 = df['small'].str.extract(r'(\S*$)', expand=True)  # last token
But I can't figure out how to do it for the whole DataFrame. I have many similar DataFrames with different column and row names, so I'm looking for a solution that I can reuse. Thanks!
You can do it like this:
df1 = df.applymap(lambda x: x.split()[0])
df2 = df.applymap(lambda x: x.split()[1])
Example df:
   small  medium
0  0 33%   0 33%
1  1 44%   1 33%
2  2 55%   1 55%
df1:
  small medium
0     0      0
1     1      1
2     2      1
df2:
  small medium
0   33%    33%
1   44%    33%
2   55%    55%
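Since the question asks for something reusable across many DataFrames, the two lines above can be wrapped in a small helper. This is only a sketch: the name split_cells and its position argument are made up for illustration, and it assumes every cell is a string with at least two whitespace-separated tokens. It uses the column-wise .str accessor rather than applymap, which recent pandas releases deprecate in favor of DataFrame.map.

```python
import pandas as pd


def split_cells(df, position):
    """Return a DataFrame holding the token at `position` from every cell.

    Hypothetical helper: assumes each cell is a whitespace-separated string
    such as '258 0.12%'. Row and column labels are carried over unchanged.
    """
    return df.apply(lambda col: col.str.split().str[position])


df = pd.DataFrame(
    {"small": ["0 33%", "1 44%", "2 55%"],
     "medium": ["0 33%", "1 33%", "1 55%"]}
)

df1 = split_cells(df, 0)  # first observation of every cell
df2 = split_cells(df, 1)  # second observation of every cell

print(df1)
print(df2)
```

Because the helper never refers to specific labels, it should work for any DataFrame laid out this way, whatever its row and column names. The results are still strings.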
Using pd.DataFrame.applymap and extracting each component via operator.itemgetter:
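import pandas as pd
from operator import itemgetter

df = pd.DataFrame([['258 0.12%', '39 0.0091%', '89 0.18%'],
                   ['97 0.16%', '6 0.012%', '26 0.26%']],
                  columns=['small', 'medium', 'large'],
                  index=['apples', 'carrots'])

# Split each cell on whitespace into a [count, percentage] list
split = df.applymap(lambda x: x.split())
# First element: the count, converted to int
df1 = split.applymap(itemgetter(0)).astype(int)
# Second element: drop the trailing '%' and convert the percentage to a fraction
df2 = split.applymap(lambda x: x[1][:-1]).astype(float) / 100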
Note that you have to convert the strings to int and float, respectively.
print(df1)
         small  medium  large
apples     258      39     89
carrots     97       6     26

print(df2)
          small    medium   large
apples   0.0012  0.000091  0.0018
carrots  0.0016  0.000120  0.0026
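The asker's own str.extract idea can also be applied to every column at once, which combines the split and the numeric conversion in one pattern. The following is a rough sketch rather than a definitive solution: the helper name extract_group and the constant PATTERN are invented for this example, and the regex assumes each cell looks exactly like "<integer> <number>%".

```python
import pandas as pd

# Assumed cell layout: an integer count, whitespace, then a percentage.
PATTERN = r"^\s*(\d+)\s+([\d.]+)%\s*$"


def extract_group(df, pattern, group):
    """Hypothetical helper: apply `pattern` to every column and keep one capture group."""
    return df.apply(lambda col: col.str.extract(pattern, expand=True)[group])


df = pd.DataFrame(
    [["258 0.12%", "39 0.0091%", "89 0.18%"],
     ["97 0.16%", "6 0.012%", "26 0.26%"]],
    columns=["small", "medium", "large"],
    index=["apples", "carrots"],
)

# Group 0 is the count; group 1 is the percentage without its '%' sign.
counts = extract_group(df, PATTERN, 0).astype(int)
fractions = extract_group(df, PATTERN, 1).astype(float) / 100

print(counts)
print(fractions)
```

A cell that does not match the pattern comes back as a missing value from str.extract, so the astype(int) call would raise on malformed input instead of passing it through silently.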