I have a huge DataFrame and I want to merge only two of its rows, based on a condition. Below is a sample DataFrame; when I tried a groupby sum, other rows were affected as well. I only want the rows where jb_name is generic to be merged and their jb_count summed.
jb_name jb_count
0 generic 10
1 generic1 2
2 generic 15
3 other 14
I tried the following, but as I said, it affects other rows as well:
df = df.groupby(['jb_name'])['jb_count'].sum().reset_index()
I want the final DataFrame to look like this:
jb_name jb_count
0 generic 25
1 generic1 2
3 other 14
Use:
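import pandas as pd
mask = df['jb_name'] == 'generic'
# DataFrame.append was removed in pandas 2.0, so combine the two parts with pd.concat
df = pd.concat([df[mask].groupby('jb_name', as_index=False).sum(), df[~mask]], ignore_index=True)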
Alternatively, we can set the index to jb_name and sum on level 0 for the rows where the index is generic:
df = df.set_index('jb_name')
mask = (df.index == 'generic')
# sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
df1 = pd.concat([df[mask].groupby(level=0).sum(), df[~mask]]).reset_index()
Result:
# print(df1)
jb_name jb_count
0 generic 25
1 generic1 2
2 other 14
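If the original row labels matter (the desired output in the question keeps index 0, 1, 3), a different approach is to write the sum into the first matching row and drop the other matching rows. This is only a sketch on the sample data from the question, not the method used in the answer above; the names first and out are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "jb_name": ["generic", "generic1", "generic", "other"],
    "jb_count": [10, 2, 15, 14],
})

mask = df["jb_name"] == "generic"
out = df.copy()

if mask.any():
    # Index label of the first row where the mask is True
    first = mask.idxmax()
    # Store the total of all matching rows in that first row
    out.loc[first, "jb_count"] = df.loc[mask, "jb_count"].sum()
    # Keep every non-matching row plus the first matching row
    out = out[~mask | (out.index == first)]

print(out)
```

Because nothing is concatenated or re-indexed, the remaining rows keep their original order and index labels.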
# Boolean select: group and sum the matching rows, then concatenate the remaining rows
m = df.jb_name == 'generic'
pd.concat([df[m].groupby('jb_name').sum().reset_index(), df[~m]])
jb_name jb_count
0 generic 25
1 generic1 2
3 other 14
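The mask-then-concatenate idea can be wrapped in a small helper to check that duplicate names other than the target are left alone. This is a minimal sketch: merge_label is a hypothetical name, and it assumes the frame has only the key and value columns (any extra columns would come back as NaN in the merged row).

```python
import pandas as pd

def merge_label(df, label, key="jb_name", value="jb_count"):
    """Sum `value` over rows where `key` equals `label`; leave other rows untouched."""
    mask = df[key] == label
    if not mask.any():
        # Nothing to merge, return an unchanged copy
        return df.copy()
    merged = df[mask].groupby(key, as_index=False)[value].sum()
    return pd.concat([merged, df[~mask]], ignore_index=True)

# 'other' appears twice here and must NOT be merged
df = pd.DataFrame({
    "jb_name": ["generic", "other", "generic", "other"],
    "jb_count": [10, 2, 15, 14],
})

result = merge_label(df, "generic")
print(result)
```

A plain df.groupby('jb_name').sum() on this frame would also collapse the two other rows into one, which is the side effect the question describes.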