Drop columns in pandas dataframe based on conditions
Suppose I have the following dataframe:
+---+---------+------+------+------+
| | summary | col1 | col2 | col3 |
+---+---------+------+------+------+
| 0 | count | 10 | 10 | 10 |
+---+---------+------+------+------+
| 1 | mean | 4 | 5 | 5 |
+---+---------+------+------+------+
| 2 | stddev | 3 | 3 | 3 |
+---+---------+------+------+------+
| 3 | min | 0 | -1 | 5 |
+---+---------+------+------+------+
| 4 | max | 100 | 56 | 47 |
+---+---------+------+------+------+
How can I keep only the columns where count > 5, mean > 4 and min > 0, along with the summary column?
The desired output is:
+---+---------+------+
| | summary | col3 |
+---+---------+------+
| 0 | count | 10 |
+---+---------+------+
| 1 | mean | 5 |
+---+---------+------+
| 2 | stddev | 3 |
+---+---------+------+
| 3 | min | 5 |
+---+---------+------+
| 4 | max | 47 |
+---+---------+------+
You need:
df2 = df.set_index('summary').T
m1 = df2['count'] > 5
m2 = df2['mean'] > 4
m3 = df2['min'] > 0
df2.loc[m1 & m2 & m3].T.reset_index()
Output:
summary col3
0 count 10
1 mean 5
2 stddev 3
3 min 5
4 max 47
Note: you can put the conditions directly inside .loc[], but when there are several conditions it is better to use separate mask variables (m1, m2, m3).
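To make the note above concrete, here is a minimal sketch that compares the two styles on the frame from the question. The DataFrame construction is an assumption based on the table shown in the question, and the names inline and masked are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

# Transpose so that each original column becomes a row that can be filtered.
df2 = df.set_index("summary").T

# Style 1: conditions written directly inside .loc[]
inline = df2.loc[(df2["count"] > 5) & (df2["mean"] > 4) & (df2["min"] > 0)]

# Style 2: one named mask per condition, combined at the end
m1 = df2["count"] > 5
m2 = df2["mean"] > 4
m3 = df2["min"] > 0
masked = df2.loc[m1 & m2 & m3]

# Transpose back and restore summary as a regular column.
result = masked.T.reset_index()
print(result)
```

Both styles select the same rows. The named masks are mainly easier to read and to inspect one at a time when a filter does not behave as expected.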
Using loc with a callable:
(df.set_index('summary').T
.loc[lambda x: (x['count'] > 5) & (x['mean'] > 4) & (x['min'] > 0)]
.T.reset_index())
Here is one way:
s=df.set_index('summary')
com=pd.Series([5,4,0],index=['count','mean','min'])
idx=s.loc[com.index].gt(com,axis=0).all().loc[lambda x : x].index
s[idx]
Out[142]:
col3
summary
count 10
mean 5
stddev 3
min 5
max 47
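The threshold-Series idea above can be wrapped in a small reusable function. This is only a sketch: keep_columns_above and its parameters are hypothetical names introduced here for illustration, and the DataFrame construction is assumed from the table in the question.

```python
import pandas as pd

def keep_columns_above(df, thresholds, key="summary"):
    """Hypothetical helper: keep the columns whose statistics all exceed the given thresholds."""
    stats = df.set_index(key)
    limits = pd.Series(thresholds)
    # Select only the rows that have a threshold, then compare row by row.
    # axis=0 aligns the thresholds with the row labels (count, mean, min).
    passed = stats.loc[limits.index].gt(limits, axis=0).all()
    # passed is a boolean Series indexed by column name.
    return stats.loc[:, passed]

# Example frame from the question (assumed construction).
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

kept = keep_columns_above(df, {"count": 5, "mean": 4, "min": 0})
print(kept)
```

Because the thresholds are matched by label rather than by position, the filter does not depend on the order of the rows in the summary column.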
Using query:
(
df.set_index('summary')
.rename(str.title).T
.query('Count > 5 and Mean > 4 and Min > 0')
.T.rename(str.lower)
.reset_index()
)
summary col3
0 count 10
1 mean 5
2 stddev 3
3 min 5
4 max 47
(
df[['summary']].join(
df.iloc[:, 1:].loc[:, df.iloc[[0, 1, 3], 1:].T.gt([5, 4, 0]).all(axis=1)]
)
)
summary col3
0 count 10
1 mean 5
2 stddev 3
3 min 5
4 max 47
Set the summary column as the index, then do the following:
df.T.query("(count > 5) & (mean > 4) & (min > 0)").T