
Pandas groupby with new column for each value

I hope the title speaks for itself; I'd just like to add that it can be assumed that each key has the same number of values. Searching online for the title turned up the following question:

Split pandas dataframe based on groupby

It is supposed to solve my problem, but it does not. Here is an example:

Input:

pd.DataFrame(data={'a':['foo','foo','foo','bar','bar','bar'],'b':[1,2,3,4,5,6]})

Output:

pd.DataFrame(data={'a':['foo','bar'],'b':[1,4],'c':[2,5],'d':[3,6]})

Intuitively, it would be a groupby without an aggregation function, or with an aggregation function that collects each key's values into a list.

Obviously, it can be done 'manually' using for loops etc., but using for loops with large data sets is very expensive computationally.

Use GroupBy.cumcount to build a per-group counter g (as a separate Series or as a new column), then reshape with DataFrame.set_index + Series.unstack or with DataFrame.pivot. Finally, clean up the result with DataFrame.add_prefix, DataFrame.rename_axis and DataFrame.reset_index:

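import pandas as pd

df1 = pd.DataFrame(data={'a':['foo','foo','foo','bar','bar','bar'],'b':[1,2,3,4,5,6]})

# Counter of each row within its group
g = df1.groupby('a').cumcount()
df = (df1.set_index(['a', g])['b']
         .unstack()
         .add_prefix('new_')
         .reset_index()
         .rename_axis(None, axis=1))
print(df)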
     a  new_0  new_1  new_2
0  bar      4      5      6
1  foo      1      2      3
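
To see why the counter makes the reshape possible, it can help to look at the intermediate results. The following is only an illustrative sketch on the question's sample data; the name wide is introduced here and is not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                    'b': [1, 2, 3, 4, 5, 6]})

# Number each row within its group, starting at 0
g = df1.groupby('a').cumcount()
print(g.tolist())             # [0, 1, 2, 0, 1, 2]

# (a, g) is now a unique two-level index, so the inner level can be
# moved to the columns; unstack also sorts the keys
wide = df1.set_index(['a', g])['b'].unstack()
print(wide.columns.tolist())  # [0, 1, 2]
print(wide.index.tolist())    # ['bar', 'foo']
```

Without the counter, the key 'a' alone is duplicated, so there would be nothing to tell the reshape which value goes into which new column.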

Or:

df1['g'] = df1.groupby('a').cumcount()
# pivot needs keyword arguments in pandas 2.0 and later
df = df1.pivot(index='a', columns='g', values='b').add_prefix('new_').reset_index().rename_axis(None, axis=1)
print(df)
     a  new_0  new_1  new_2
0  bar      4      5      6
1  foo      1      2      3
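
Both variants return the keys in sorted order (bar before foo) and name the new columns new_0, new_1, new_2, whereas the question asks for foo first and for columns b, c, d. One possible way to match the requested output exactly is sketched below. It assumes the keys should appear in order of first appearance and that letters starting from b are wanted; the names wide and result are introduced here for illustration only.

```python
import pandas as pd
from string import ascii_lowercase

df1 = pd.DataFrame({'a': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                    'b': [1, 2, 3, 4, 5, 6]})

g = df1.groupby('a').cumcount()
wide = df1.set_index(['a', g])['b'].unstack()

# unstack sorts the keys; restore the order of first appearance
wide = wide.reindex(pd.Index(df1['a'].unique(), name='a'))

# Rename the positional columns 0, 1, 2 to b, c, d
wide.columns = list(ascii_lowercase[1:1 + wide.shape[1]])

result = wide.reset_index()
print(result)
```

Since every key is still present after the reindex, no missing values are introduced and the value columns keep their integer dtype.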

Here is an alternative approach that uses GroupBy.apply, plus string.ascii_lowercase if the column names matter:

import pandas as pd
from string import ascii_lowercase

df = pd.DataFrame(data={'a':['foo','foo','foo','bar','bar','bar'],'b':[1,2,3,4,5,6]})

# Group by 'a' and collect each group's 'b' values into a list
g = df.groupby('a')['b'].apply(list)

# Construct a new DataFrame from g, one row per key
new_df = pd.DataFrame(g.tolist(), index=g.index).reset_index()

# Relabel every column with consecutive letters: a, b, c, ...
new_df.columns = list(ascii_lowercase[:new_df.shape[1]])

print(new_df)

     a  b  c  d
0  bar  4  5  6
1  foo  1  2  3
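
Both answers rely on the question's assumption that every key has the same number of values. If that does not hold, the shorter groups are padded with NaN and the affected columns become float. A minimal sketch of checking the assumption first, using a made-up frame in which bar has only two values (the names sizes, equal_sized and wide are hypothetical):

```python
import pandas as pd

# Made-up data: 'bar' has one value fewer than 'foo'
df = pd.DataFrame({'a': ['foo', 'foo', 'foo', 'bar', 'bar'],
                   'b': [1, 2, 3, 4, 5]})

# The assumption holds only if all groups have one common size
sizes = df.groupby('a').size()
equal_sized = sizes.nunique() == 1
print(equal_sized)

# The list-based reshape still runs, but pads the short group with NaN
g = df.groupby('a')['b'].apply(list)
wide = pd.DataFrame(g.tolist(), index=g.index)
print(wide)
```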


 