I hope the title speaks for itself; I'd just like to add that each key can be assumed to have the same number of values. Searching online for the title turned up the following solution:
Split pandas dataframe based on groupby
which is supposed to solve my problem, but it does not. Here is an example:
Input:
pd.DataFrame(data={'a':['foo','foo','foo','bar','bar','bar'],'b':[1,2,3,4,5,6]})
Output:
pd.DataFrame(data={'a':['foo','bar'],'b':[1,4],'c':[2,5],'d':[3,6]})
Intuitively, it would be a groupby function without an aggregation function, or with an aggregation function that makes a list out of each key's values.
Obviously, it can be done 'manually' using for loops etc., but using for loops with large data sets is very expensive computationally.
Use GroupBy.cumcount to build a counter, either as a separate Series or as a column g, then reshape with DataFrame.set_index + Series.unstack or with DataFrame.pivot, and finally clean up the result with DataFrame.add_prefix, DataFrame.rename_axis and DataFrame.reset_index:
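import pandas as pd
# df1 is the input frame from the question
df1 = pd.DataFrame(data={'a':['foo','foo','foo','bar','bar','bar'],'b':[1,2,3,4,5,6]})
# Position of each row within its group: 0, 1, 2, ...
g = df1.groupby('a').cumcount()
df = (df1.set_index(['a', g])['b']
         .unstack()
         .add_prefix('new_')
         .reset_index()
         .rename_axis(None, axis=1))
print(df)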
     a  new_0  new_1  new_2
0  bar      4      5      6
1  foo      1      2      3
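To see why the reshape works, it can help to look at the intermediate counter. The following is a minimal sketch, assuming df1 is the input frame from the question; the name wide is only used here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                    'b': [1, 2, 3, 4, 5, 6]})

# cumcount numbers the rows inside each group, starting at 0
g = df1.groupby('a').cumcount()
print(g.tolist())  # [0, 1, 2, 0, 1, 2]

# The pair (a, g) is unique per row, so unstack can move g into the columns
wide = df1.set_index(['a', g])['b'].unstack()
print(wide.columns.tolist())  # [0, 1, 2]
print(wide.index.tolist())    # ['bar', 'foo']
```

Note that unstack sorts the keys, which is why bar comes before foo in the result.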
Or:
df1['g'] = df1.groupby('a').cumcount()
# pivot takes keyword arguments only in pandas 2.0 and later
df = (df1.pivot(index='a', columns='g', values='b')
         .add_prefix('new_')
         .reset_index()
         .rename_axis(None, axis=1))
print(df)
     a  new_0  new_1  new_2
0  bar      4      5      6
1  foo      1      2      3
Here is an alternative approach, using groupby.apply and string.ascii_lowercase if column names are important:
import pandas as pd
from string import ascii_lowercase
df = pd.DataFrame(data={'a':['foo','foo','foo','bar','bar','bar'],'b':[1,2,3,4,5,6]})
# Groupby 'a' and collect each group's values into a list
g = df.groupby('a')['b'].apply(list)
# Construct new DataFrame from g, one column per list position
new_df = pd.DataFrame(g.values.tolist(), index=g.index).reset_index()
# Fix column names: 'a', 'b', 'c', ...
new_df.columns = list(ascii_lowercase[:new_df.shape[1]])
print(new_df)
     a  b  c  d
0  bar  4  5  6
1  foo  1  2  3
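The output above lists bar before foo, while the desired output in the question lists foo first. One possible way to keep the keys in order of first appearance is to pass sort=False to groupby. This is a sketch under that assumption, not part of the original answer:

```python
import pandas as pd
from string import ascii_lowercase

df = pd.DataFrame({'a': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                   'b': [1, 2, 3, 4, 5, 6]})

# sort=False keeps the groups in the order they first appear in df
g = df.groupby('a', sort=False)['b'].apply(list)

# One row per key, one column per position in the list
new_df = pd.DataFrame(g.tolist(), index=g.index).reset_index()
new_df.columns = list(ascii_lowercase[:new_df.shape[1]])

print(new_df)
```

This relies on every key having the same number of values, as stated in the question; with ragged groups the shorter rows would be padded with missing values.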