Retrieving the rows based on a condition in a DataFrame
Input DataFrame:
id typeofAddress city state
1 Home Kolkata WB
1 Office Calcutta WB
2 Home Columbus OH
3 Home
3 Office SanFrancisco CA
For each id, I have to pull the row where typeofAddress is Home and city is not empty; otherwise, pull the row where typeofAddress is Office.
Expected output:
id typeofAddress city state
1 Home Kolkata WB
2 Home Columbus OH
3 Office SanFrancisco CA
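For anyone who wants to reproduce this, here is a minimal sketch that builds the input frame. It assumes the missing city and state for id 3 are stored as empty strings (the question does not say whether they are '' or NaN), and the variable name df is only a convention.

```python
import pandas as pd

# Input data from the question; '' is an assumed placeholder for the
# missing city/state of the Home row for id 3.
df = pd.DataFrame({
    'id': [1, 1, 2, 3, 3],
    'typeofAddress': ['Home', 'Office', 'Home', 'Home', 'Office'],
    'city': ['Kolkata', 'Calcutta', 'Columbus', '', 'SanFrancisco'],
    'state': ['WB', 'WB', 'OH', '', 'CA'],
})

# Rows that cannot be used as a Home address because the city is empty
missing_city = df[df['city'].eq('')]
print(df)
print(missing_city)
```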
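Create a rank column based on your priorities:
import numpy as np  # assumes df is the input DataFrame, with '' for missing values
# Rank 1: Home with a city; rank 2: Office with a city; rank 3: everything else
condlist = [df['typeofAddress'].eq('Home') & df['city'].ne(''),
            df['typeofAddress'].eq('Office') & df['city'].ne('')]
rank = np.select(condlist, choicelist=[1, 2], default=3)
# Sort by rank, then keep the best-ranked row for each id
out = (df.assign(rank=rank).sort_values('rank')
         .groupby('id').first()
         .drop(columns='rank').reset_index())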
Output:
>>> out
id typeofAddress city state
0 1 Home Kolkata WB
1 2 Home Columbus OH
2 3 Office SanFrancisco CA
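One caveat with the ranking approach: if the missing values are NaN rather than empty strings, ne('') is True for them, and groupby(...).first() by default takes the first non-null value per column, so it can combine values from different rows. The sketch below is a variant under that assumption (NaN for missing data). It treats both NaN and '' as "no city" and uses drop_duplicates to keep whole rows. The names has_city and out_nan are just illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the question, but with NaN (an assumption) for the missing values
df = pd.DataFrame({
    'id': [1, 1, 2, 3, 3],
    'typeofAddress': ['Home', 'Office', 'Home', 'Home', 'Office'],
    'city': ['Kolkata', 'Calcutta', 'Columbus', np.nan, 'SanFrancisco'],
    'state': ['WB', 'WB', 'OH', np.nan, 'CA'],
})

# Treat both NaN and '' as "no city"
has_city = df['city'].notna() & df['city'].ne('')

# Rank 1: usable Home row; rank 2: usable Office row; rank 3: anything else
condlist = [df['typeofAddress'].eq('Home') & has_city,
            df['typeofAddress'].eq('Office') & has_city]
rank = np.select(condlist, [1, 2], default=3)

# Sort so the best-ranked row comes first within each id, then keep that whole row
out_nan = (df.assign(rank=rank)
             .sort_values(['id', 'rank'])
             .drop_duplicates('id')
             .drop(columns='rank')
             .reset_index(drop=True))
print(out_nan)
```

If an id has two rows with the same rank (say, two Home addresses), this keeps whichever sorts first, so add a tie-breaker column to sort_values if that matters.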
You can solve this with a boolean mask on the DataFrame. Search for "boolean masking with pandas" for more details.
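import pandas as pd
# Sample data from the question; '' marks a missing city/state
d = {'id': [1, 1, 2, 3, 3],
     'typeofAddress': ['Home', 'Office', 'Home', 'Home', 'Office'],
     'city': ['Kolkata', 'Calcutta', 'Columbus', '', 'SanFrancisco'],
     'state': ['WB', 'WB', 'OH', '', 'CA']}
df = pd.DataFrame(d)
# Home rows with a non-empty city, plus every Office row (id 1's Office row is kept too)
output = df[((df['typeofAddress'] == 'Home') & (df['city'] != '')) | (df['typeofAddress'] == 'Office')]
output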
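Note that the mask above keeps every Office row, so id 1 comes back twice (Home/Kolkata and Office/Calcutta), which differs from the expected output in the question. A possible way to make Office a per-id fallback while staying with boolean masks is sketched below. It assumes '' marks a missing city, and the names home_ok, office_fallback and result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2, 3, 3],
    'typeofAddress': ['Home', 'Office', 'Home', 'Home', 'Office'],
    'city': ['Kolkata', 'Calcutta', 'Columbus', '', 'SanFrancisco'],
    'state': ['WB', 'WB', 'OH', '', 'CA'],
})

# Home rows that have a usable (non-empty) city
home_ok = df['typeofAddress'].eq('Home') & df['city'].ne('')

# Office rows count only for ids that have no usable Home row
ids_with_home = df.loc[home_ok, 'id']
office_fallback = df['typeofAddress'].eq('Office') & ~df['id'].isin(ids_with_home)

result = df[home_ok | office_fallback].reset_index(drop=True)
print(result)
```

This keeps the original row order and needs no sorting, but unlike the ranking approach it returns all matching rows if an id has several usable Home or Office rows.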