Check a pandas DataFrame column for TRUE/FALSE; if TRUE, check another column against a condition and generate a new column with PASS/FAIL values
Input description: I have a DataFrame 'df' that contains the columns 'Space' and 'Threshold'.
Space Threshold
TRUE 0.1
TRUE 0.25
FALSE 0.5
FALSE 0.6
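Scenario to handle: when df['Space'] is TRUE, check whether df['Threshold'] <= 0.2. Generate a new column named df['Space_Test'] whose value is PASS if both conditions hold and FAIL if df['Space'] is TRUE but the threshold check fails. If df['Space'] is FALSE, put 'FALSE' in the new column df['Space_Test'].
Expected output: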
Space Threshold Space_Test
TRUE 0.1 PASS
TRUE 0.25 FAIL
FALSE 0.5 FALSE
FALSE 0.6 FALSE
Code tried: I tried the line below for this scenario, but it does not work.
df['Space_Test'] = np.where(df['Space'] == 'TRUE',np.where(df['Threshold'] <= 0.2, 'Pass', 'Fail'),'FALSE')
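I need help solving this. Thanks in advance!
If TRUE is a boolean, your solution simplifies to using df['Space'] directly as the condition, without comparing it to the string 'TRUE':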
df['Space_Test'] = np.where(df['Space'],
np.where(df['Threshold'] <= 0.2, 'Pass', 'Fail'),'FALSE')
print (df)
Space Threshold Space_Test
0 True 0.10 Pass
1 True 0.25 Fail
2 False 0.50 FALSE
3 False 0.60 FALSE
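The attempt in the question compares the column with the string 'TRUE', which matches nothing when the column actually holds booleans. If the dtype is not known in advance, one option is to normalise the column to a boolean mask first. The following is a minimal sketch under that assumption; the sample data and the name `space` are hypothetical and not part of the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 'Space' holds the strings 'TRUE'/'FALSE', not booleans.
df = pd.DataFrame({
    'Space': ['TRUE', 'TRUE', 'FALSE', 'FALSE'],
    'Threshold': [0.1, 0.25, 0.5, 0.6],
})

# Build a boolean mask that works for both string and boolean columns:
# True -> 'True' -> 'TRUE', and 'TRUE' stays 'TRUE'.
space = df['Space'].astype(str).str.upper().eq('TRUE')

df['Space_Test'] = np.where(space,
                            np.where(df['Threshold'] <= 0.2, 'Pass', 'Fail'),
                            'FALSE')
print(df)
```

Checking `df['Space'].dtype` beforehand shows which case applies: `bool` means the column can be used as a mask directly, while `object` (or a string dtype) means the values are text.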
An alternative with numpy.select:
m1 = df['Space']
m2 = df['Threshold'] <= 0.2
df['Space_Test'] = np.select([m1 & m2, m1 & ~m2], ['Pass', 'Fail'],'FALSE')
print (df)
Space Threshold Space_Test
0 True 0.10 Pass
1 True 0.25 Fail
2 False 0.50 FALSE
3 False 0.60 FALSE
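The snippets above assume `df` and `np` already exist. As a self-contained sketch, the sample data from the question can be rebuilt (assuming 'Space' is a boolean column) to confirm that the nested np.where and the np.select versions give the same labels. The names `nested`, `selected`, and `same` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question, assuming 'Space' is a boolean column.
df = pd.DataFrame({
    'Space': [True, True, False, False],
    'Threshold': [0.1, 0.25, 0.5, 0.6],
})

m1 = df['Space']
m2 = df['Threshold'] <= 0.2

# Nested np.where: the inner call decides Pass/Fail, the outer one handles False rows.
nested = np.where(m1, np.where(m2, 'Pass', 'Fail'), 'FALSE')

# np.select: one condition per label, with 'FALSE' as the default.
selected = np.select([m1 & m2, m1 & ~m2], ['Pass', 'Fail'], 'FALSE')

same = bool((nested == selected).all())
print(same)
```

np.select tends to read more clearly once there are more than two or three outcomes, since each label gets its own explicit condition instead of another level of nesting.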
Another solution:
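from pandas import DataFrame

names = {
    'Space': ['TRUE', 'TRUE', 'FALSE', 'FALSE'],
    'Threshold': [0.1, 0.25, 1, 2]
}
df = DataFrame(names, columns=['Space', 'Threshold'])

# Here 'Space' holds strings, so compare against 'TRUE' explicitly.
is_space = df['Space'] == 'TRUE'
df.loc[is_space & (df['Threshold'] <= 0.2), 'Space_Test'] = 'Pass'
df.loc[is_space & (df['Threshold'] > 0.2), 'Space_Test'] = 'Fail'
# Rows where 'Space' is not 'TRUE' get 'FALSE', matching the expected output.
df.loc[~is_space, 'Space_Test'] = 'FALSE'
print(df)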