How to combine rows in a DataFrame based on whether one row contains a value from another row

Question

I have a DataFrame that looks like this, with additional columns:

ID         Paired_ID      ... 
123_1      123_2
123_2      123_1
456_1      456_2
456_2      456_1
789_1      789_2
789_2      789_1
789_3      789_4
789_4      789_3

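What I want to do is, for a given ID, find the row whose Paired_ID equals that ID and combine the two rows into one. I have been trying to use a pandas merge:

pd.merge(df, df, left_on="ID", right_on="Paired_ID")

but I get duplicates and can't figure out how to get rid of them.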

I want:

ID_x        Paired_ID_x      ID_y     Paired_ID_y  ...
123_1      123_2             123_2      123_1
456_1      456_2             456_2      456_1
789_1      789_2             789_2      789_1
789_3      789_4             789_4      789_3

Answer 1

This assumes that every value in ID also appears in Paired_ID.
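That assumption can be checked before going any further. The following is a minimal sketch, not part of the original answer: the frame is rebuilt from the sample in the question with the extra columns left out, and the names `all_paired` and `unpaired` are made up for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question (extra columns omitted).
df = pd.DataFrame({
    "ID":        ["123_1", "123_2", "456_1", "456_2",
                  "789_1", "789_2", "789_3", "789_4"],
    "Paired_ID": ["123_2", "123_1", "456_2", "456_1",
                  "789_2", "789_1", "789_4", "789_3"],
})

# True only if every ID also shows up somewhere in Paired_ID.
all_paired = df["ID"].isin(df["Paired_ID"]).all()

# Rows whose ID never appears as a Paired_ID (empty for the sample data).
unpaired = df.loc[~df["ID"].isin(df["Paired_ID"])]
```

If `unpaired` is not empty, those rows have no partner and would throw off the row alignment in the steps that follow.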

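Compare the numeric endings after the '_' separator and split the data into two new DataFrames.

Concatenate the two DataFrames along the column axis to get the output.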

import pandas as pd

# Extract the numeric ending after the last '_' in ID and Paired_ID
A = df.ID.str.split('_').str[-1].astype(int)
B = df.Paired_ID.str.split('_').str[-1].astype(int)

# Split df on the comparison outcome and add suffixes to the column names
filter_1 = df.loc[A.le(B)].reset_index(drop=True).add_suffix('_x')
filter_2 = df.loc[~A.le(B)].reset_index(drop=True).add_suffix('_y')

# Concatenate along the columns axis to get the outcome
# (rows line up by position, so the pairs must appear in the same order in both halves)
pd.concat([filter_1, filter_2], axis=1)


    ID_x    Paired_ID_x ID_y    Paired_ID_y
0   123_1   123_2       123_2   123_1
1   456_1   456_2       456_2   456_1
2   789_1   789_2       789_2   789_1
3   789_3   789_4       789_4   789_3
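The merge from the question can also be kept, with the mirrored rows dropped afterwards. This is a sketch of a different approach from the answer above, under the same assumption that IDs come in reciprocal pairs; the frame is rebuilt from the question's sample, and `merged` and `result` are illustrative names.

```python
import pandas as pd

# Sample frame rebuilt from the question (extra columns omitted).
df = pd.DataFrame({
    "ID":        ["123_1", "123_2", "456_1", "456_2",
                  "789_1", "789_2", "789_3", "789_4"],
    "Paired_ID": ["123_2", "123_1", "456_2", "456_1",
                  "789_2", "789_1", "789_4", "789_3"],
})

# Self-merge: each row is joined to the row that names it as Paired_ID.
# Overlapping column names get the default '_x' / '_y' suffixes.
merged = pd.merge(df, df, left_on="ID", right_on="Paired_ID")

# Each pair now appears twice (A-B and B-A). Keep only the copy in which
# ID_x sorts before ID_y; the comparison is a plain string comparison and
# is used here only to pick one row of each mirrored pair.
result = merged.loc[merged["ID_x"] < merged["ID_y"]].reset_index(drop=True)
```

Because the rows are matched on the key values rather than on position, this variant does not depend on the order of the rows in df. A row whose ID equals its own Paired_ID would be dropped by the filter.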
