Pandas multiple conditions groupby

How can I group by multiple conditions? For example:

For rows where column CL is a, b or c, group by columns A and C and take [TOTAL].min(); for rows where column CL is d, e or f, group by column B and take [TOTAL].min().

CL  | A | B | C | TOTAL
a   | 1 | 6 | 5 | 125,000
b   | 2 | 5 | 5 | 140,000
c   | 3 | 4 | 5 | 148,000
d   | 4 | 3 | 6 | 125,000
e   | 5 | 2 | 6 | 136,000
f   | 6 | 1 | 6 | 156,000

Original table
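For anyone who wants to run the answers below, the table can be rebuilt as a DataFrame roughly like this. This is only a sketch: it assumes TOTAL is meant to be an integer column, so the thousands separators shown in the table are dropped.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Assumption: TOTAL is numeric, so "125,000" is entered as 125000.
df = pd.DataFrame({
    "CL": ["a", "b", "c", "d", "e", "f"],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [6, 5, 4, 3, 2, 1],
    "C": [5, 5, 5, 6, 6, 6],
    "TOTAL": [125000, 140000, 148000, 125000, 136000, 156000],
})

print(df)
```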

OK, I see 2 options here:

(1) do the grouping and aggregating separately, then concatenate the results:

pd.concat([df.loc[df["CL"].isin(["a", "b", "c"])].groupby(["A", "C"])["TOTAL"].min(),
df.loc[df["CL"].isin(["d", "e", "f"])].groupby("B")["TOTAL"].min()])

Outputs:

(1, 5)    125000
(2, 5)    140000
(3, 5)    148000
1         156000
2         136000
3         125000
Name: TOTAL, dtype: int64
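Note that the concatenated Series above has a mixed index (tuples for the first group, plain integers for the second). If a flat table is easier to work with, one possible variation is to reset the index of each piece before concatenating. This is a sketch under the assumption that df holds the sample data from the question; the names first_mask, second_mask, part1, part2 and result are made up for illustration.

```python
import pandas as pd

# Sample data from the question (TOTAL assumed to be integer).
df = pd.DataFrame({
    "CL": ["a", "b", "c", "d", "e", "f"],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [6, 5, 4, 3, 2, 1],
    "C": [5, 5, 5, 6, 6, 6],
    "TOTAL": [125000, 140000, 148000, 125000, 136000, 156000],
})

first_mask = df["CL"].isin(["a", "b", "c"])
second_mask = df["CL"].isin(["d", "e", "f"])

# Aggregate each subset on its own keys, turning the keys back into columns.
part1 = df.loc[first_mask].groupby(["A", "C"])["TOTAL"].min().reset_index()
part2 = df.loc[second_mask].groupby("B")["TOTAL"].min().reset_index()

# Stack the two results; keys that do not apply to a row show up as NaN.
result = pd.concat([part1, part2], ignore_index=True)
print(result)
```

The grouping keys that were not used for a given row end up as NaN, which plays the same role as the dummy key in option (2).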

(2) alternatively, make up a dummy grouping key. For instance, you can mask the unwanted grouping keys with -1:

import numpy as np

# work on a copy so the original df is not modified
df2 = df.copy()

# mask the unwanted grouping keys with any value other than None/NaN
# (rows with a missing key are dropped from the groups by default);
# choose a value that cannot be confused with a real grouping key

df2["B"]=np.where(df["CL"].isin(["a", "b", "c"]), -1, df["B"])
df2["A"]=np.where(df["CL"].isin(["a", "b", "c"]), df["A"], -1)
df2["C"]=np.where(df["CL"].isin(["a", "b", "c"]), df["C"], -1)

df2.groupby(["A", "B", "C"])["TOTAL"].min()

Outputs:

A   B   C
-1   1  -1    156000
     2  -1    136000
     3  -1    125000
 1  -1   5    125000
 2  -1   5    140000
 3  -1   5    148000
Name: TOTAL, dtype: int64

I ended up resolving this by adding an extra column 'test' with the following code:

z['test'] = np.where(z['ACTIVITY_PHASE'].isin(['FRAC','COIL']), z['TOTAL'], 
                (np.where(z['ACTIVITY_PHASE']=='PREW', z.groupby(z['ACTIVITY_PHASE']=='PREW')['TOTAL'].transform('min'), 
                (np.where(z['ACTIVITY_PHASE']=='WINF', z.groupby(z['ACTIVITY_PHASE']=='WINF')['TOTAL'].transform('min'), 
                (np.where(z['ACTIVITY_PHASE']=='WOR', z.groupby(z['ACTIVITY_PHASE']=='WOR')['TOTAL'].transform('min'), 0)))))))
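The nested np.where calls above can probably be flattened. Since each branch takes the minimum TOTAL over the rows of one phase, a single groupby on ACTIVITY_PHASE with transform('min') should give the same per-row values, and np.select can then pick between them. The sketch below uses made-up sample data (only the column names come from the post), and the names phase_min, keep_own and use_min are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names are taken from the post.
z = pd.DataFrame({
    "ACTIVITY_PHASE": ["FRAC", "COIL", "PREW", "PREW", "WINF", "WINF", "WOR", "OTHER"],
    "TOTAL": [10, 20, 30, 25, 40, 45, 50, 60],
})

# Minimum TOTAL within each phase, broadcast back to every row.
phase_min = z.groupby("ACTIVITY_PHASE")["TOTAL"].transform("min")

# Rows that keep their own TOTAL vs. rows that take the phase minimum.
keep_own = z["ACTIVITY_PHASE"].isin(["FRAC", "COIL"])
use_min = z["ACTIVITY_PHASE"].isin(["PREW", "WINF", "WOR"])

# First matching condition wins; any other phase falls back to 0.
z["test"] = np.select([keep_own, use_min], [z["TOTAL"], phase_min], default=0)
print(z)
```

Adding another phase then only means extending one of the two lists rather than nesting another np.where.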

