I have a pandas dataframe as follows:
import pandas as pd
df = pd.DataFrame({
'State':['am','am','am','am','am','am','am','am','am','fg','fg','fg','fg','fg','fg','fg'],
'PC':['A','A','A','A','B','B','B','B','B','C','C','C','D','D','D','D'],
'Party':['alpha','beta','delta','yellow','alpha','beta','blue','pink','gamma','alpha','beta','kappa','alpha','gamma','kappa','lambda'],
'Votes':[10,15,50,5,11,2,5,4,60,3,1,70,12,34,52,43]
})
I want to add a Total column containing the sum of the votes for each PC. Note that PC names are only unique within a state (the same name, e.g. 'A', could appear in two different states such as 'am' and 'fg'), so the sums must be computed separately for each combination of State and PC. I do this as follows:
df['Total'] = df.groupby(['State','PC']).Votes.transform('sum')
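To illustrate why transform is used here rather than a plain aggregation, the following is a minimal sketch on a small made-up frame (the name demo and its values are assumptions for illustration, not the data above). transform('sum') returns one value per original row, so it can be assigned directly as a column, whereas sum() returns one value per group.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the frame from the question).
# The PC name 'A' appears in both states on purpose.
demo = pd.DataFrame({
    'State': ['am', 'am', 'fg', 'fg'],
    'PC': ['A', 'A', 'A', 'A'],
    'Votes': [10, 15, 3, 1],
})

# transform('sum') broadcasts each group's sum back onto its rows
demo['Total'] = demo.groupby(['State', 'PC'])['Votes'].transform('sum')

# A plain sum() collapses to one row per (State, PC) group instead
per_group = demo.groupby(['State', 'PC'])['Votes'].sum()

print(demo)
print(per_group)
```

Grouping on both State and PC keeps the two 'A' constituencies separate; grouping on PC alone would merge them.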
After that, I want to retain only the top two 'Party' rows by 'Votes' for each combination of 'State' and 'PC'. If the top two do not include 'beta', I want a third row for 'beta'. Finally, I want to capture any remaining 'Votes' in a new row with 'Party' set to 'REST', where needed.
In sum I want the output as follows:
df_out = pd.DataFrame({
'State':['am','am','am','am','am','am','am','fg','fg','fg','fg','fg','fg'],
'PC':['A','A','A','B','B','B','B','C','C','C','D','D','D'],
'Party':['delta','beta','REST','gamma','alpha','REST','beta','kappa','alpha','beta','kappa','lambda','REST'],
'Votes':[50,15,15,60,11,9,2,70,3,1,52,43,46],
'Total':[80,80,80,82,82,82,82,74,74,74,141,141,141]
})
How do I do this?
Here is one way: use groupby + tail on the sorted frame to get the top two rows per group, combine the others with groupby + agg, then concat everything back. If the top two do not include beta, I add that row back to s1.
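# Top two parties by votes within each (PC, State) group
s1 = df.sort_values('Votes').groupby(['PC', 'State']).tail(2)
# Everything outside the top two
s2 = df[~df.index.isin(s1.index)]
# Add back any 'beta' row that missed the top two
s1 = pd.concat([s1, s2.loc[s2.Party == 'beta']])
# Sum the remaining rows into one 'REST' row per group
s2 = s2[~s2.index.isin(s1.index)].groupby(['PC', 'State']).agg({'Votes': 'sum', 'Total': 'first'}).assign(Party='REST')
yourdf = pd.concat([s1, s2.reset_index()], sort=True).sort_values(['PC', 'State'])
yourdf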
Out[517]:
   PC   Party State  Total  Votes
1   A    beta    am     80     15
2   A   delta    am     80     50
0   A    REST    am     80     15
4   B   alpha    am     82     11
8   B   gamma    am     82     60
5   B    beta    am     82      2
1   B    REST    am     82      9
9   C   alpha    fg     74      3
11  C   kappa    fg     74     70
10  C    beta    fg     74      1
15  D  lambda    fg    141     43
14  D   kappa    fg    141     52
2   D    REST    fg    141     46
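As an alternative, the same steps can be sketched with a per-group rank instead of sort_values + tail. This is only a sketch under stated assumptions: the helper name top_two_with_rest, its parameters always_keep and n, and the temporary _rank column are all made up for illustration, and ties are broken by row order via method='first'. It also orders each group by votes descending with REST last.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['am'] * 9 + ['fg'] * 7,
    'PC': ['A'] * 4 + ['B'] * 5 + ['C'] * 3 + ['D'] * 4,
    'Party': ['alpha', 'beta', 'delta', 'yellow', 'alpha', 'beta', 'blue', 'pink',
              'gamma', 'alpha', 'beta', 'kappa', 'alpha', 'gamma', 'kappa', 'lambda'],
    'Votes': [10, 15, 50, 5, 11, 2, 5, 4, 60, 3, 1, 70, 12, 34, 52, 43],
})

def top_two_with_rest(df, always_keep='beta', n=2):
    """Hypothetical helper: keep the top-n parties per (State, PC), always keep
    `always_keep`, and collapse whatever is left into a single 'REST' row."""
    keys = ['State', 'PC']
    out = df.copy()
    out['Total'] = out.groupby(keys)['Votes'].transform('sum')
    # Rank 1 = most votes in the group; ties are broken by row order
    out['_rank'] = out.groupby(keys)['Votes'].rank(method='first', ascending=False)
    keep = (out['_rank'] <= n) | (out['Party'] == always_keep)
    # One 'REST' row per group that has leftover rows; infinite rank sorts it last
    rest = (out[~keep]
            .groupby(keys, as_index=False)
            .agg(Votes=('Votes', 'sum'), Total=('Total', 'first'))
            .assign(Party='REST', _rank=float('inf')))
    result = pd.concat([out[keep], rest], ignore_index=True)
    return (result.sort_values(keys + ['_rank'])
                  .drop(columns='_rank')
                  .reset_index(drop=True))

result = top_two_with_rest(df)
print(result)
```

A quick sanity check on either approach is that the Votes in each (State, PC) group of the result still add up to that group's Total.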