
Apply a value across a group of a Pandas Data Frame

I am trying to collect values within each group where the types match, and attach the result to the row where store=1.

In the example below, group A contains one store=1 row and three store=2 rows.

I would like to roll up the num values of all type 3 rows in group A into the store=1 row.

Sample data:

import pandas as pd
data = {'group':['A','A','A','A','B','B','B','B'],'store':['1','2','2','2','1','2','2','2'],'type':['3','3','1','1','5','0','5','5'],'num':['10','20','30','40','50','60','70','80']}
t1=pd.DataFrame(data)
group store type num    
A     1     3    10 
A     2     3    20 
A     2     1    30
A     2     1    40 
B     1     5    50 
B     2     0    60 
B     2     5    70 
B     2     5    80

The correct output should be a new column ('new_num') that holds, at the store=1 row of each group, a list of the num values from the rows in that group whose type matches. Every other row gets an empty list.

group store type num new_num
A     1     3    10  ['10','20']
A     2     3    20  []
A     2     1    30  []
A     2     1    40  []
B     1     5    50  ['50','70','80']
B     2     0    60  []
B     2     5    70  []
B     2     5    80  []

IIUC (if I understand correctly):

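# start every row with its own empty list
t1['new_num']=[[] for x in range(len(t1))]
# per group: collect 'num' from rows whose type matches the store=='1' row, and store that list on the store=='1' row
t1.loc[t1.store=='1','new_num']=[y.loc[y.type.isin(y.loc[y.store=='1','type']),'num'].tolist() for x , y in t1.groupby('group',sort=False)]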
t1
Out[369]: 
  group store type num       new_num
0     A     1    3  10      [10, 20]
1     A     2    3  20            []
2     A     2    1  30            []
3     A     2    1  40            []
4     B     1    5  50  [50, 70, 80]
5     B     2    0  60            []
6     B     2    5  70            []
7     B     2    5  80            []
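For anyone who finds the one-liner dense, the same idea can be sketched as an explicit loop. This is only a sketch under the assumptions of the sample data (all columns hold strings, and `store == '1'` marks the target row); `roll_up` is a hypothetical helper name, not something pandas provides.

```python
import pandas as pd

data = {'group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
        'store': ['1', '2', '2', '2', '1', '2', '2', '2'],
        'type':  ['3', '3', '1', '1', '5', '0', '5', '5'],
        'num':   ['10', '20', '30', '40', '50', '60', '70', '80']}
t1 = pd.DataFrame(data)


def roll_up(df):
    """Hypothetical helper: return a copy of df with a 'new_num' list column."""
    # Every row starts with its own empty list.
    lists = {idx: [] for idx in df.index}
    for _, grp in df.groupby('group', sort=False):
        # Rows of this group that should receive the rolled-up list.
        anchors = grp[grp['store'] == '1']
        # 'num' values of all rows whose type appears on an anchor row.
        matched = grp.loc[grp['type'].isin(anchors['type']), 'num'].tolist()
        for idx in anchors.index:
            lists[idx] = list(matched)
    new_col = pd.Series([lists[idx] for idx in df.index],
                        index=df.index, dtype=object)
    return df.assign(new_num=new_col)


result = roll_up(t1)
print(result)
```

Building the column as a Series of Python lists and attaching it with `assign` avoids assigning a ragged list of lists through `.loc`, at the cost of being slower than the vectorised version on large frames.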

Setup

ncol = [[] for _ in range(t1.shape[0])]
res = t1.set_index('group').assign(new_num=ncol)

1) Using some wonky string concats and groupbys

u = t1.group + t1.type
check = u[t1.store.eq('1')]
m = t1.loc[u.isin(check)].groupby('group')['num'].agg(list)

res.loc[res.store.eq('1'), 'new_num'] = m
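One caveat with the concatenated key: in principle, different (group, type) pairs can produce the same string (for example 'A1' + '2' and 'A' + '12'). A possible way around that, sketched here as an assumption rather than as part of the original answer, is to match on the two columns directly with an inner merge. The names `keys` and `matched` are illustrative.

```python
import pandas as pd

data = {'group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
        'store': ['1', '2', '2', '2', '1', '2', '2', '2'],
        'type':  ['3', '3', '1', '1', '5', '0', '5', '5'],
        'num':   ['10', '20', '30', '40', '50', '60', '70', '80']}
t1 = pd.DataFrame(data)

# (group, type) pairs that occur on the store == '1' rows.
keys = t1.loc[t1['store'].eq('1'), ['group', 'type']].drop_duplicates()

# The inner merge keeps only rows whose (group, type) pair is in `keys`.
matched = t1.merge(keys, on=['group', 'type'], how='inner')

# One list of 'num' values per group, indexed by group.
m = matched.groupby('group')['num'].agg(list)
print(m)
```

The resulting `m` plays the same role as the `m` built from the concatenated key, so it can be assigned to the store=1 rows in the same way.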

2) If you'd like to stray even further from the light, use an abomination of a pivot

f = t1.pivot_table(
  index=['group', 'type'],
  columns='store',
  values='num',
  aggfunc=list
).reset_index()

m = f[f['1'].notnull()].set_index('group').drop(columns='type').sum(axis=1)

res.loc[res.store.eq('1'), 'new_num'] = m

Both somehow manage to produce:

      store type num       new_num
group
A         1    3  10      [10, 20]
A         2    3  20            []
A         2    1  30            []
A         2    1  40            []
B         1    5  50  [50, 70, 80]
B         2    0  60            []
B         2    5  70            []
B         2    5  80            []

While it is a terrible use of pivot, I actually think that solution is pretty neat. The intermediate frame f looks like this:

store group type     1         2
0         A    1   NaN  [30, 40]
1         A    3  [10]      [20]
2         B    0   NaN      [60]
3         B    5  [50]  [70, 80]

The pivot produces the aggregation above. The rows where column 1 is non-null are exactly the group/type combinations you are after, and summing across those rows concatenates the lists into the aggregated list you need.
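To make the "find the non-null rows, then combine across the row" step concrete, here is a rough sketch that builds the same wide table with groupby and unstack and concatenates the lists by hand. It is an illustration only, not the answer's method: `wide`, `matched` and the dict `m` are assumed names, and the sample data is the one from the question.

```python
import pandas as pd

data = {'group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
        'store': ['1', '2', '2', '2', '1', '2', '2', '2'],
        'type':  ['3', '3', '1', '1', '5', '0', '5', '5'],
        'num':   ['10', '20', '30', '40', '50', '60', '70', '80']}
t1 = pd.DataFrame(data)

# One list of 'num' per (group, type, store), then spread store into columns.
wide = t1.groupby(['group', 'type', 'store'])['num'].agg(list).unstack('store')

# Keep only the (group, type) rows that have a store == '1' entry.
matched = wide[wide['1'].notna()]

# Concatenate the per-store lists of each remaining row, skipping NaN cells.
m = {}
for (grp, _type), row in matched.iterrows():
    cells = [cell for cell in row if isinstance(cell, list)]
    m.setdefault(grp, []).extend(value for cell in cells for value in cell)

print(m)
```

Concatenating explicitly keeps the NaN handling visible instead of leaving it to a row-wise sum over object columns.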
