pandas find similar row in a column, make a new column according to the condition
I have a df:
query subject HPSame
0 WP_77.1 WP_706.1 HPS_1
1 WP_78.1 WP_46.1 HPS_2
2 WP_57.1 WP_26.1 HPS_3
3 WP_57.1 WP_627.1 HPS_4
4 WP_15.1 WP_16.1 HPS_5
5 WP_15.1 WP_17.1 HPS_6
6 WP_15.1 WP_63.1 HPS_7
7 WP_15.1 WP_61.1 HPS_8
8 WP_15.1 WP_56.1 HPS_9
9 WP_40.1 WP_11.1 HPS_10
I tried:
df['query_s'] = df['query'].shift(-1)
df['HPSame_s'] = df['HPSame'].shift(-1)
condition = [(df['query'] == df['query_s'])]
ifTrue = df['HPSame']
ifFalse = df['HPSame_s']
df['match'] = np.where(condition, ifTrue, ifFalse)
This raises ValueError: Length of values does not match length of index
I also tried the following, but it did not give me the result I want.
df.loc[(df['query'] == df['query_s']), 'match'] = df['HPSame']
df.loc[(df['query'] != df['query_s']), 'match'] = df['HPSame_s']
I am looking for this result:
query subject HPSame match
0 WP_77.1 WP_706.1 HPS_1 HPS_1
1 WP_78.1 WP_46.1 HPS_2 HPS_2
2 WP_57.1 WP_26.1 HPS_3 HPS_3
3 WP_57.1 WP_627.1 HPS_4 HPS_3
4 WP_15.1 WP_16.1 HPS_5 HPS_5
5 WP_15.1 WP_17.1 HPS_6 HPS_5
6 WP_15.1 WP_63.1 HPS_7 HPS_5
7 WP_15.1 WP_61.1 HPS_8 HPS_5
8 WP_15.1 WP_56.1 HPS_9 HPS_5
9 WP_40.1 WP_11.1 HPS_10 HPS_10
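On the ValueError in the question: judging from the code shown, a likely cause is that `condition` is a list containing the boolean Series rather than the Series itself. `np.where` then returns a two-dimensional array with a leading axis of length 1, which cannot be assigned to a single column. The sketch below uses a reduced five-row sample of the data to show the shape difference; `same_as_next`, `wrapped`, and `flat` are illustrative names, not taken from the question.

```python
import numpy as np
import pandas as pd

# Reduced sample of the question's data (five rows, two columns).
df = pd.DataFrame({
    "query": ["WP_57.1", "WP_57.1", "WP_15.1", "WP_15.1", "WP_15.1"],
    "HPSame": ["HPS_3", "HPS_4", "HPS_5", "HPS_6", "HPS_7"],
})
same_as_next = df["query"] == df["query"].shift(-1)

# Wrapping the boolean Series in a list adds a leading axis of length 1.
wrapped = np.where([same_as_next], df["HPSame"], df["HPSame"].shift(-1))
# Passing the Series itself keeps the result one-dimensional.
flat = np.where(same_as_next, df["HPSame"], df["HPSame"].shift(-1))

print(wrapped.shape)  # (1, 5)
print(flat.shape)     # (5,)
```

Passing the Series directly should remove the error, but comparing each row only with the next one still does not carry the first HPSame of a run across all of its rows.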
You can use ffill:
df['match'] = df['HPSame'].where(df['query'] != df['query'].shift()).ffill()
Output:
query subject HPSame match
0 WP_77.1 WP_706.1 HPS_1 HPS_1
1 WP_78.1 WP_46.1 HPS_2 HPS_2
2 WP_57.1 WP_26.1 HPS_3 HPS_3
3 WP_57.1 WP_627.1 HPS_4 HPS_3
4 WP_15.1 WP_16.1 HPS_5 HPS_5
5 WP_15.1 WP_17.1 HPS_6 HPS_5
6 WP_15.1 WP_63.1 HPS_7 HPS_5
7 WP_15.1 WP_61.1 HPS_8 HPS_5
8 WP_15.1 WP_56.1 HPS_9 HPS_5
9 WP_40.1 WP_11.1 HPS_10 HPS_10
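To make the one-liner easier to follow, here is a self-contained sketch that rebuilds the frame from the question and splits the expression into its three steps. The intermediate names `starts_run` and `first_only` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "query": ["WP_77.1", "WP_78.1", "WP_57.1", "WP_57.1", "WP_15.1",
              "WP_15.1", "WP_15.1", "WP_15.1", "WP_15.1", "WP_40.1"],
    "subject": ["WP_706.1", "WP_46.1", "WP_26.1", "WP_627.1", "WP_16.1",
                "WP_17.1", "WP_63.1", "WP_61.1", "WP_56.1", "WP_11.1"],
    "HPSame": [f"HPS_{i}" for i in range(1, 11)],
})

# True on the first row of each run of consecutive equal query values
# (row 0 compares against NaN from shift(), so it is always True).
starts_run = df["query"] != df["query"].shift()
# Keep HPSame only on those rows; every other row becomes NaN.
first_only = df["HPSame"].where(starts_run)
# Carry the last kept value forward over the NaN rows.
df["match"] = first_only.ffill()

print(df)
```

This works on runs of adjacent rows, so it assumes rows with the same query sit next to each other, as they do in the sample data.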
You can also use groupby.transform('first'):
# transform returns a Series aligned to df's index, so no reset_index is needed
df['match'] = df.groupby('query')['HPSame'].transform('first')
Output:
query subject HPSame match
0 WP_77.1 WP_706.1 HPS_1 HPS_1
1 WP_78.1 WP_46.1 HPS_2 HPS_2
2 WP_57.1 WP_26.1 HPS_3 HPS_3
3 WP_57.1 WP_627.1 HPS_4 HPS_3
4 WP_15.1 WP_16.1 HPS_5 HPS_5
5 WP_15.1 WP_17.1 HPS_6 HPS_5
6 WP_15.1 WP_63.1 HPS_7 HPS_5
7 WP_15.1 WP_61.1 HPS_8 HPS_5
8 WP_15.1 WP_56.1 HPS_9 HPS_5
9 WP_40.1 WP_11.1 HPS_10 HPS_10
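The two approaches agree on the sample data, but they can differ when the same query value comes back later in a separate run. The sketch below uses a small hypothetical frame (the values "A", "B", and "H1" to "H4" are made up for illustration) to show the difference; `by_run` and `by_value` are illustrative names.

```python
import pandas as pd

# Hypothetical data: "A" appears in two separate runs.
df = pd.DataFrame({
    "query": ["A", "A", "B", "A"],
    "HPSame": ["H1", "H2", "H3", "H4"],
})

# Run-based: a new run starts whenever query differs from the previous row.
by_run = df["HPSame"].where(df["query"] != df["query"].shift()).ffill()
# Value-based: every row with the same query gets that query's first HPSame.
by_value = df.groupby("query")["HPSame"].transform("first")

print(by_run.tolist())    # ['H1', 'H1', 'H3', 'H4']
print(by_value.tolist())  # ['H1', 'H1', 'H3', 'H1']
```

Which one is appropriate depends on whether a repeated query further down should count as the same group or as a new one.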