Selecting rows with pandas based on specific conditions

Question

I'd like to return the rows in which all columns are > 0, or in which only the 2012 column is allowed to be < 0.

import pandas as pd
import numpy as np

df = pd.DataFrame( {
   'A': ['d','d','d','f','f','f','g','g','g','h','h','h'],
   'B': [5,5,6,7,5,6,6,7,7,6,7,7],
   'C': [1,1,1,1,1,1,1,1,1,1,1,1],
   'S': [2012,2013,2014,2015,2016,2012,2013,2014,2015,2016,2012,2013]     
    } );

df = (df.B + df.C).groupby([df.A, df.S]).sum().unstack(fill_value=0)
print (df)

@jezrael, not exactly. I changed the DataFrame to explain better. In the final result I need the rows where all columns are > 0 AND the rows where every column is > 0 except 2012, which may be < 0. The result must be a new df with the columns that qualify. So, in the example below, g yes, d no.

df = pd.DataFrame( {
   'A': ['d','d','d','d','d','d','g','g','g','g','g','g'],
   'B': [5,5,6,-7,5,6,-6,7,7,6,-7,7],
   'C': [1,1,1,1,1,1,1,1,1,1,1,1],
   'S': [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,2013]     
    } );

df = (df.B + df.C).groupby([df.A, df.S]).sum().unstack(fill_value=0)

S  2012  2013  2014  2015  2016
A                              
d    13     6     7    -6     6
g   -11     8     8     8     7

Edited DataFrame:

df = pd.DataFrame( {
   'A':  ['d','d','d','d','d','d','g','g','g','g','g','g',
    'k','k','k','k','k','k'],
   'B': [5,5,6,7,5,6,-6,7,7,6,-7,7,-8,7,-6,6,-7,50],
   'C': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2],
   'S':   [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,
        2013,2012,2013,2014,2015,2016,2014]     
    } );

df = (df.B + df.C).groupby([df.A, df.S]).sum().unstack(fill_value=0)
print (df)

S  2012  2013  2014  2015  2016
A                              
d    13     6     7     8     6
g   -11     8     8     8     7
k    -6     9    48     8    -5

Answer 1

I think you can use two masks, one to select rows and one to select columns:

df = pd.DataFrame( {
   'A': ['d','d','d','f','f','f','g','g','g','g','h','h','h', 'f'],
   'B': [5,5,6,7,5,6,-6,7,7,7,6,7,7,2],
   'C': [1,1,1,1,1,1,1,1,1,1,1,1,1,1],
   'S': [2012,2013,2014,2015,2016,2012,2012,2013,2014,2015,2016,2012,2013,2013]     
    } );

df = (df.B + df.C).groupby([df.A, df.S]).sum().unstack(fill_value=0)
print (df)
S  2012  2013  2014  2015  2016
A                              
d     6     6     7     0     0
f     7     3     0     8     6
g    -5     8     8     8     0
h     8     8     0     0     7

mask1 = df[2012] < 0
print (mask1)
A
d    False
f    False
g     True
h    False
Name: 2012, dtype: bool

mask2 = (df > 0).all()
print (mask2)
S
2012    False
2013     True
2014    False
2015    False
2016    False
dtype: bool

print (df.loc[mask1, mask2])
S  2013
A      
g     8

print (df[mask1])
S  2012  2013  2014  2015  2016
A                              
g    -5     8     8     8     0

print (df.loc[:,mask2])
S  2013
A      
d     6
f     3
g     8
h     8
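
The two masks above work along different axes: `df[2012] < 0` yields one boolean per row, while `(df > 0).all()` reduces down each column and yields one boolean per column. A minimal sketch of that difference, assuming the pivoted table printed above is typed in by hand (the names `table`, `col_mask` and `row_mask` are chosen here for illustration and are not from the answer):

```python
import pandas as pd

# The pivoted table from the output above, entered directly for illustration.
table = pd.DataFrame(
    {2012: [6, 7, -5, 8],
     2013: [6, 3, 8, 8],
     2014: [7, 0, 8, 0],
     2015: [0, 8, 8, 0],
     2016: [0, 6, 0, 7]},
    index=pd.Index(list("dfgh"), name="A"),
)

# Default axis=0: reduce down each column, giving one boolean per column.
col_mask = (table > 0).all()

# axis=1: reduce across each row, giving one boolean per row.
row_mask = (table > 0).all(axis=1)

cols_kept = list(table.loc[:, col_mask].columns)  # columns that are positive in every row
rows_kept = list(table[row_mask].index)           # rows that are positive in every column

print(cols_kept)
print(rows_kept)
```

Because the missing combinations were filled with 0, no row of this table is strictly positive everywhere, so the row mask selects nothing here.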

EDIT, following the edit to the question:

mask1 = df[2012] < 0
print (mask1)
A
d    False
g     True
Name: 2012, dtype: bool

mask2 = (df.drop(2012, axis=1) > 0).all(axis=1)
print (mask2)
A
d    False
g     True
dtype: bool

print (df[mask1 & mask2])
S  2012  2013  2014  2015  2016
A                              
g   -11     8     8     8     7
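
On the three-group DataFrame from the edited question, `mask1 & mask2` keeps only the rows where 2012 is negative. If rows that are positive in every column should be kept as well, one possible reading of the requirement is "every column other than 2012 is > 0, and 2012 is either positive or negative". The following is a sketch under that assumption, not the answer's own code; `table`, `others_positive` and `selected` are illustrative names:

```python
import pandas as pd

# Data from the edited question.
df = pd.DataFrame({
    'A': ['d'] * 6 + ['g'] * 6 + ['k'] * 6,
    'B': [5, 5, 6, 7, 5, 6, -6, 7, 7, 6, -7, 7, -8, 7, -6, 6, -7, 50],
    'C': [1] * 12 + [2] * 6,
    'S': [2012, 2013, 2014, 2015, 2016, 2012,
          2012, 2014, 2015, 2016, 2012, 2013,
          2012, 2013, 2014, 2015, 2016, 2014],
})
table = (df.B + df.C).groupby([df.A, df.S]).sum().unstack(fill_value=0)

# Row mask: every column except 2012 is strictly positive.
others_positive = (table.drop(columns=2012) > 0).all(axis=1)

# The answer's combination: 2012 negative AND all other columns positive.
only_negative_2012 = table[(table[2012] < 0) & others_positive]

# Assumed reading of the question: 2012 may be positive or negative (but not 0).
selected = table[others_positive & (table[2012] != 0)]

print(only_negative_2012)
print(selected)
```

With this data, the first selection contains only `g`, while the second contains `d` and `g`; `k` is excluded in both because its 2016 value is negative.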

Answer 2

Combine the conditions with `|` and wrap each one in parentheses:

df[((df > 0).all(axis=1)) | (df[2012] < 0)]
Out[22]: 
Empty DataFrame
Columns: [2012, 2013, 2014, 2015, 2016]
Index: []
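
The parentheses matter because Python's `|` and `&` bind more tightly than comparisons such as `<`, so each comparison has to be wrapped before the masks are combined. As a sketch of how this expression behaves on the table from the edited question (entered by hand here; `table`, `loose` and `tightened` are illustrative names, and the tightened variant is an assumption about what the asker wants rather than part of this answer):

```python
import pandas as pd

# Pivoted table from the edited question, entered directly for illustration.
table = pd.DataFrame(
    {2012: [13, -11, -6],
     2013: [6, 8, 9],
     2014: [7, 8, 48],
     2015: [8, 8, 8],
     2016: [6, 7, -5]},
    index=pd.Index(['d', 'g', 'k'], name='A'),
)

# The expression from this answer: all columns positive OR 2012 negative.
loose = table[((table > 0).all(axis=1)) | (table[2012] < 0)]

# Possible tightening: when 2012 is negative, also require the other columns to be positive.
others_positive = (table.drop(columns=2012) > 0).all(axis=1)
tightened = table[((table > 0).all(axis=1)) | ((table[2012] < 0) & others_positive)]

print(loose)
print(tightened)
```

Here `loose` keeps `d`, `g` and `k`, because the second clause looks only at 2012 and lets `k` through despite its negative 2016 value; `tightened` keeps `d` and `g`.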

Answer 1: score 1, accepted, 2016-11-20 16:02:34

Answer 2: score 0, 2016-11-20 15:45:51
