简体   繁体   中英

pandas - checking a condition for each group in a dataframe

I have got a dataframe:

df = pd.DataFrame({'index' : range(8),
'variable1' : ["A","A","B","B","A","B","B","A"],
'variable2' : ["a","b","a","b","a","b","a","b"],
'variable3' : ["x","x","x","y","y","y","x","y"],
'result': [1,0,0,1,1,0,0,1]})

df2 = df.pivot_table(values='result',rows='index',cols=['variable1','variable2','variable3'])
df2['A']['a']['x'][4] = 1
df2['B']['a']['x'][3] = 1

variable1   A               B    
variable2   a       b       a   b
variable3   x   y   x   y   x   y
index                            
0           1 NaN NaN NaN NaN NaN
1         NaN NaN   0 NaN NaN NaN
2         NaN NaN NaN NaN   0 NaN
3         NaN NaN NaN NaN   1   1
4           1   1 NaN NaN NaN NaN
5         NaN NaN NaN NaN NaN   0
6         NaN NaN NaN NaN   0 NaN
7         NaN NaN NaN   1 NaN NaN

Now I want to check for simultaneous occurrences of x == 1 and y == 1 , but only within each subgroup, defined by variable1 and variable2 . So, for the dataframe shown above, the condition is met for index == 4 (group Aa ), but not for index == 3 (groups Ba and Bb ).

I suppose some groupby() magic would be needed, but I cannot find the right way. I have also tried experimenting with a stacked dataframe (using df.stack() ), but this did not get me any closer...

You can use groupby on the 2 first levels variable1 and variable2 to get the sum of the x and y columns at that level:

r = df2.groupby(level=[0,1], axis=1).sum()

r
Out[50]: 
variable1   A       B    
variable2   a   b   a   b
index                    
0           1 NaN NaN NaN
1         NaN   0 NaN NaN
2         NaN NaN   0 NaN
3         NaN NaN   1   1
4           2 NaN NaN NaN
5         NaN NaN NaN   0
6         NaN NaN   0 NaN
7         NaN   1 NaN NaN

Consequently, the lines you are searching for are the ones that contain the value 2:

r[r==2].dropna(how='all')
Out[53]: 
variable1  A       B    
variable2  a   b   a   b
index                   
4          2 NaN NaN NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM