pandas Add and rename multiple columns based on single column using concat
I have this df:
group owner failed granted_pe slots
0 g1 u1 0 single 1
1 g50 u92 0 shared 8
2 g50 u92 0 shared 1
The df can be created with the following code:
import pandas as pd

df = pd.DataFrame([['g1', 'u1', 0, 'single', 1],
                   ['g50', 'u92', '0', 'shared', '8'],
                   ['g50', 'u92', '0', 'shared', '1']],
                  columns=['group', 'owner', 'failed', 'granted_pe', 'slots'])
# Some values are entered as strings, so cast every column to its intended dtype
df = df.astype(dtype={'group': 'str', 'owner': 'str', 'failed': 'int', 'granted_pe': 'str', 'slots': 'int'})
print(df)
Using groupby, I created three columns computed on the "slots" column:
df_calculated = pd.concat([
df.loc[:,['group', 'slots']].groupby(['group']).sum(),
df.loc[:,['group', 'slots']].groupby(['group']).mean(),
df.loc[:,['group', 'slots']].groupby(['group']).max()
], axis=1)
print(df_calculated)
slots slots slots
group
g1 1 1.0 1
g50 9 4.5 8
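Question 1: Naming the new columns appropriately
Can I add an argument to concat to name these columns "slots_sum", "slots_avg", and "slots_max"?
Question 2: Adding the columns to df
I would like the new columns to be added to df immediately to the right of the "source" column (in this case "slots"). The desired output looks like this: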
group owner failed granted_pe slots slots_sum slots_avg slots_max
0 g1 u1 0 single 1 1 1.0 1
1 g50 u92 0 shared 8 9 4.5 8
2 g50 u92 0 shared 1 9 4.5 8
My actual df has 4.5 million rows (23 columns). I will be doing similar things for other columns.
Use agg with add_prefix, then merge the result back into df:
yourdf=df.merge(df.groupby('group')['slots'].agg(['sum','mean','max']).add_prefix('slots_').reset_index(),how='left')
Out[86]:
group owner failed ... slots_sum slots_mean slots_max
0 g1 u1 0 ... 1 1.0 1
1 g50 u92 0 ... 9 4.5 8
2 g50 u92 0 ... 9 4.5 8
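The merge above produces a column called slots_mean rather than the slots_avg requested in the question. Here is a minimal sketch of one way to get the exact names, assuming pandas 0.25 or later (where named aggregation is available). The variable names stats and result are chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [['g1', 'u1', 0, 'single', 1],
     ['g50', 'u92', 0, 'shared', 8],
     ['g50', 'u92', 0, 'shared', 1]],
    columns=['group', 'owner', 'failed', 'granted_pe', 'slots'])

# Named aggregation: keyword = output column name,
# value = (source column, aggregation function)
stats = df.groupby('group').agg(
    slots_sum=('slots', 'sum'),
    slots_avg=('slots', 'mean'),
    slots_max=('slots', 'max'),
).reset_index()

# A left merge on 'group' repeats the per-group values on every original row
result = df.merge(stats, on='group', how='left')
print(result)
```

Because slots is the last column of this df, the merged columns already land directly to its right. With a source column in the middle of the frame, the columns would need reordering afterwards.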
Another approach is to pass the keys parameter to pd.concat and then flatten the resulting MultiIndex column headers:
df = pd.DataFrame([['g1', 'u1', 0, 'single', 1],
['g50', 'u92', '0', 'shared', '8'],
['g50', 'u92', '0', 'shared', '1']],
columns=['group', 'owner', 'failed','granted_pe', 'slots'])
df = (df.astype(dtype={'group':'str', 'owner':'str','failed':'int', 'granted_pe':'str', 'slots':'int'}))
df_calculated = pd.concat([
df.loc[:,['group', 'slots']].groupby(['group']).sum(),
df.loc[:,['group', 'slots']].groupby(['group']).mean(),
df.loc[:,['group', 'slots']].groupby(['group']).max()
], axis=1, keys=['sum','mean','max'])
df_calculated.columns = [f'{j}_{i}' for i,j in df_calculated.columns]
print(df_calculated)
Output:
slots_sum slots_mean slots_max
group
g1 1 1.0 1
g50 9 4.5 8
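df_calculated is indexed by group, so it still has to be attached to the original rows to answer Question 2. One possible sketch, not taken from the answers above, uses groupby(...).transform, which returns values aligned to the original rows, together with DataFrame.insert to place each new column directly to the right of its source column. The helper name add_group_stats and its parameters are hypothetical, not part of pandas:

```python
import pandas as pd

def add_group_stats(df, by, source, funcs):
    """Return a copy of df with one new column per (suffix, function) pair
    in funcs, inserted immediately to the right of the `source` column.

    Assumes `source` is a unique column label. Hypothetical helper, not a pandas API.
    """
    out = df.copy()
    grouped = df.groupby(by)[source]
    # Position just after the source column
    pos = out.columns.get_loc(source) + 1
    for offset, (suffix, func) in enumerate(funcs.items()):
        # transform broadcasts the per-group result back to every row
        out.insert(pos + offset, f'{source}_{suffix}', grouped.transform(func))
    return out

df = pd.DataFrame(
    [['g1', 'u1', 0, 'single', 1],
     ['g50', 'u92', 0, 'shared', 8],
     ['g50', 'u92', 0, 'shared', 1]],
    columns=['group', 'owner', 'failed', 'granted_pe', 'slots'])

result = add_group_stats(df, 'group', 'slots',
                         {'sum': 'sum', 'avg': 'mean', 'max': 'max'})
print(result)
```

The same call can be repeated for other source columns, for example add_group_stats(result, 'group', 'failed', {'sum': 'sum'}). Whether transform or agg followed by merge is faster on a 4.5-million-row frame is not something this sketch measures; it would need timing on the real data.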