Split columns in dataframe into two new dataframes

I have data in a DataFrame where each cell holds two observations:

                          small             medium        large
apples                258 0.12%         39 0.0091%     89 0.18%
carrots                97 0.16%          6  0.012%     26 0.26%
bananas               377 0.14%         12  0.018%    128 0.22%
pears                 206 0.17%          7  0.034%    116 0.24%

I'd like to create two separate dataframes, to split the observations. Something like this:

                    small           medium          large
apples                258               39             89
carrots                97                6             26
bananas               377               12            128
pears                 206                7            116

and the second one:

                      small             medium        large
apples                0.12%            0.0091%        0.18%
carrots               0.16%             0.012%        0.26%
bananas               0.14%             0.018%        0.22%
pears                 0.17%             0.034%        0.24%

I can do the splitting column by column:

 new_df1 = df['small'].str.extract(r'([^\s]+)', expand=True)   # first token in the cell
 new_df2 = df['small'].str.extract(r'([^\s]*$)', expand=True)  # last token in the cell

But I can't figure out how to do it for the whole DataFrame. I have many similar DataFrames with different column and row names, so I'm looking for a solution that I can reuse. Thanks!

You can apply the split to every cell with applymap:

df1 = df.applymap(lambda x: x.split()[0])
df2 = df.applymap(lambda x: x.split()[1])

Example df:

   small medium
0  0 33%  0 33%
1  1 44%  1 33%
2  2 55%  1 55%

df1:

  small medium
0     0      0
1     1      1
2     2      1

df2:

  small medium
0   33%    33%
1   44%    33%
2   55%    55%
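Since the question asks for something reusable, the two applymap calls above could be wrapped in a small function. This is only a sketch: split_cells is a hypothetical name, and it assumes every cell is a string with at least two whitespace-separated parts. It also accounts for DataFrame.applymap being deprecated since pandas 2.1 in favour of DataFrame.map, by using whichever of the two the installed version provides.

```python
import pandas as pd

# DataFrame.applymap was deprecated in pandas 2.1 in favour of DataFrame.map;
# pick whichever method this installation provides.
_ELEMENTWISE = 'map' if hasattr(pd.DataFrame, 'map') else 'applymap'


def split_cells(df):
    """Hypothetical helper: split every 'a b' cell into two DataFrames.

    Assumes each cell is a string containing at least two
    whitespace-separated parts.
    """
    # str.split with no separator splits on any run of whitespace,
    # so '6  0.012%' (two spaces) is handled as well.
    parts = getattr(df, _ELEMENTWISE)(str.split)
    first = getattr(parts, _ELEMENTWISE)(lambda p: p[0])
    second = getattr(parts, _ELEMENTWISE)(lambda p: p[1])
    return first, second


df = pd.DataFrame({'small': ['0 33%', '1 44%', '2 55%'],
                   'medium': ['0 33%', '1 33%', '1 55%']})
df1, df2 = split_cells(df)
```

Both results keep the row and column labels of the input, so the same function should work for DataFrames with different names.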

You can use pd.DataFrame.applymap and extract each component via operator.itemgetter:

import pandas as pd
from operator import itemgetter

df = pd.DataFrame([['258 0.12%', '39 0.0091%', '89 0.18%'],
                   ['97 0.16%', '6  0.012%', '26 0.26%']],
                  columns=['small', 'medium', 'large'],
                  index=['apples', 'carrots'])

split = df.applymap(lambda x: x.split())

df1 = split.applymap(itemgetter(0)).astype(int)
df2 = split.applymap(lambda x: x[1][:-1]).astype(float) / 100

Note that the split values are still strings, so you have to convert them yourself: the counts to int, and the percentages to float after stripping the trailing % sign.

print(df1)

         small  medium  large
apples     258      39     89
carrots     97       6     26

print(df2)

          small    medium   large
apples   0.0012  0.000091  0.0018
carrots  0.0016  0.000120  0.0026
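The same split and conversion can also be written column by column with the vectorised .str accessor instead of applymap. The sketch below is one possible variant, not the only way to do it; split_numeric is a hypothetical name, and it assumes the first part of each cell is a number and the second part is a number followed by %.

```python
import pandas as pd

df = pd.DataFrame([['258 0.12%', '39 0.0091%', '89 0.18%'],
                   ['97 0.16%', '6  0.012%', '26 0.26%']],
                  columns=['small', 'medium', 'large'],
                  index=['apples', 'carrots'])


def split_numeric(df):
    """Hypothetical helper: return (counts, shares) as numeric DataFrames.

    Assumes every cell looks like '<number> <number>%'.
    """
    # First token of each cell, converted to a numeric dtype.
    counts = df.apply(lambda col: pd.to_numeric(col.str.split().str[0]))
    # Second token with the trailing '%' removed, converted to a fraction.
    shares = df.apply(
        lambda col: pd.to_numeric(col.str.split().str[1].str.rstrip('%')) / 100
    )
    return counts, shares


counts, shares = split_numeric(df)
```

Drop the division by 100 if you prefer to keep the values in percent units rather than as fractions.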
