Drop columns in pandas dataframe based on conditions

Question

Assume that I have the following dataframe:

+---+---------+------+------+------+
|   | summary | col1 | col2 | col3 |
+---+---------+------+------+------+
| 0 | count   | 10   | 10   | 10   |
+---+---------+------+------+------+
| 1 | mean    | 4    | 5    | 5    |
+---+---------+------+------+------+
| 2 | stddev  | 3    | 3    | 3    |
+---+---------+------+------+------+
| 3 | min     | 0    | -1   | 5    |
+---+---------+------+------+------+
| 4 | max     | 100  | 56   | 47   |
+---+---------+------+------+------+

How can I keep only the columns where count > 5, mean > 4, and min > 0, while also keeping the summary column?

The desired output is:

+---+---------+------+
|   | summary | col3 |
+---+---------+------+
| 0 | count   | 10   |
+---+---------+------+
| 1 | mean    | 5    |
+---+---------+------+
| 2 | stddev  | 3    |
+---+---------+------+
| 3 | min     | 5    |
+---+---------+------+
| 4 | max     | 47   | 
+---+---------+------+
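
To make the question reproducible, the input frame above could be built roughly like this. This is only a sketch: it assumes the statistics are stored as integers rather than strings, which the tables do not actually state.

```python
import pandas as pd

# Rebuild the example frame from the question.
# Assumption: the statistics are numeric (int), not strings.
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})
print(df)
```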

Answer 1

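Transpose the frame so that each original column becomes a row, filter the rows with boolean masks, then transpose back:

# Transpose so each original column (col1, col2, col3) becomes a row
df2 = df.set_index('summary').T
m1 = df2['count'] > 5
m2 = df2['mean'] > 4
m3 = df2['min'] > 0
# Keep the rows that pass all three masks, then transpose back
df2.loc[m1 & m2 & m3].T.reset_index()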

Output:

    summary col3
0   count   10
1   mean    5
2   stddev  3
3   min     5
4   max     47

Note: The conditions can be written directly inside .loc[], but with several conditions it is more readable to store each one in its own mask variable (m1, m2, m3).
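
For comparison, here is a rough sketch of the inline form the note mentions. The sample frame is rebuilt from the question on the assumption that its values are numeric, and the names `stats`, `inline` and `result` are only illustrative.

```python
import pandas as pd

# Sample data from the question (assumed to be numeric).
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

# Transpose so that each original column becomes a row of statistics.
stats = df.set_index("summary").T

# All three conditions inline. Each comparison needs its own parentheses
# because & binds more tightly than > in Python.
inline = stats.loc[(stats["count"] > 5) & (stats["mean"] > 4) & (stats["min"] > 0)]

# Transpose back and restore 'summary' as a regular column.
result = inline.T.reset_index()
print(result)
```

The result is the same as with the separate masks; the trade-off is purely one of readability.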

Answer 2

Use loc with a callable, which avoids the intermediate variable:

(df.set_index('summary').T
   .loc[lambda x: (x['count'] > 5) & (x['mean'] > 4) & (x['min'] > 0)]
   .T.reset_index())
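
When `.loc` is given a callable, pandas calls it with the object being indexed and uses the return value as the indexer. As a small sketch of that behaviour, the lambda can be swapped for a named function; `passes_thresholds` and `kept` are hypothetical names, and the sample data is assumed to be numeric.

```python
import pandas as pd

# Sample data from the question (assumed to be numeric).
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})


def passes_thresholds(stats: pd.DataFrame) -> pd.Series:
    """Return a boolean mask that is True for rows meeting all thresholds."""
    return (stats["count"] > 5) & (stats["mean"] > 4) & (stats["min"] > 0)


# .loc calls passes_thresholds with the transposed frame and
# selects the rows where the returned mask is True.
kept = df.set_index("summary").T.loc[passes_thresholds].T.reset_index()
print(kept)
```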

Answer 3

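Here is one way, using a Series of thresholds:

s = df.set_index('summary')
# Threshold that each statistic must exceed
com = pd.Series([5, 4, 0], index=['count', 'mean', 'min'])
# Columns whose count, mean and min all exceed their thresholds
idx = s.loc[com.index].gt(com, axis=0).all().loc[lambda x: x].index
s[idx]

Output: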
         col3
summary      
count      10
mean        5
stddev      3
min         5
max        47
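
The one-liner above is fairly dense, so here is a step-by-step sketch of what it computes. The intermediate names `passed` and `keep` are only illustrative, and the sample data is assumed to be numeric.

```python
import pandas as pd

# Sample data from the question (assumed to be numeric).
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

s = df.set_index("summary")
com = pd.Series([5, 4, 0], index=["count", "mean", "min"])

# Compare only the rows that have a threshold; axis=0 aligns
# the labels of com with the row labels of the frame.
passed = s.loc[com.index].gt(com, axis=0)

# A column is kept only if every one of its checks is True.
keep = passed.all()
idx = keep[keep].index

print(passed)
print(list(idx))
```

Keeping the thresholds in a Series means a new condition only needs a new entry in `com`, not another mask.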

Answer 4

General thrashing about plus `query`

(
    df.set_index('summary')
      .rename(str.title).T
      .query('Count > 5 and Mean > 4 and Min > 0')
      .T.rename(str.lower)
      .reset_index()
)

  summary  col3
0   count    10
1    mean     5
2  stddev     3
3     min     5
4     max    47

Shenanigans

(
    df[['summary']].join(
        df.iloc[:, 1:].loc[:, df.iloc[[0, 1, 3], 1:].T.gt([5, 4, 0]).all(axis=1)]
    )
)
  summary  col3
0   count    10
1    mean     5
2  stddev     3
3     min     5
4     max    47

Answer 5

Set the summary column as the index, then run the following:

df.T.query("(count > 5) & (mean > 4) & (min > 0)").T

5 answers

solution1
3 ACCEPTED 2019-08-12 16:02:28

solution2
2 2019-08-12 16:12:58

solution3
1 2019-08-12 16:04:10

solution4
1 2019-08-12 16:14:21

solution5
0 2019-08-12 16:04:54
