Pandas self-join on non-unique values

Question

I have the following table:

       ind_ID  pair_ID orig_data
0           A        1         W 
1           B        1         X
2           C        2         Y
3           D        2         Z
4           A        3         W          
5           C        3         X          
6           B        4         Y          
7           D        4         Z

Each row has an ind_ID, and a pair_ID that it shares with exactly one other row. I want to do a self-join so that every row has its original data plus the data of the row it shares a pair_ID with:

       ind_ID  pair_ID orig_data partner_data
0           A        1         W            X
1           B        1         X            W
2           C        2         Y            Z
3           D        2         Z            Y
4           A        3         W            X
5           C        3         X            W
6           B        4         Y            Z
7           D        4         Z            Y

I have tried:

df.join(df, on='pair_ID')

But obviously, since pair_ID values are not unique, I get:

       ind_ID  pair_ID orig_data partner_data
0           A        1         W          NaN
1           B        1         X          NaN
2           C        2         Y          NaN
3           D        2         Z          NaN
4           A        3         W          NaN
5           C        3         X          NaN
6           B        4         Y          NaN
7           D        4         Z          NaN

I've also thought about creating a new column that concatenates ind_ID+pair_ID, which would be unique, but then the join would not know what to match on.

Is it possible to do a self-join on pair_ID where each row is joined with the matching row that is not itself?

Answer 1

In your case (with exactly two rows per pair_ID), you can probably just group by pair_ID and use transform to reverse the order of the values within each group, e.g.:

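# Reverse the raw values so the swap is positional; a reversed Series may be realigned on its index labels.
df['partner_data'] = df.groupby('pair_ID')['orig_data'].transform(lambda s: s.to_numpy()[::-1])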

Which gives you:

  ind_ID  pair_ID orig_data partner_data
0      A        1         W          X
1      B        1         X          W
2      C        2         Y          Z
3      D        2         Z          Y
4      A        3         W          X
5      C        3         X          W
6      B        4         Y          Z
7      D        4         Z          Y
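
If you would rather express this as the literal self-join the question asks about, one possible sketch is to merge the frame with itself on pair_ID and then drop the rows that were matched with themselves. This is not part of the answer above; it assumes the two rows of a pair always have different ind_ID values, and the `_partner` suffix and the `left`, `pairs` and `result` names are made up for illustration.

```python
import pandas as pd

# The table from the question, rebuilt by hand.
df = pd.DataFrame({
    "ind_ID": list("ABCDACBD"),
    "pair_ID": [1, 1, 2, 2, 3, 3, 4, 4],
    "orig_data": list("WXYZWXYZ"),
})

# Keep the original row position in a column so the order can be restored.
left = df.reset_index()

# Self-join on pair_ID: every row is matched with every row of its pair,
# itself included, so a pair of two rows produces four combinations.
pairs = left.merge(df, on="pair_ID", suffixes=("", "_partner"))

# Assumption: partners never share an ind_ID, so this drops the self-matches.
pairs = pairs[pairs["ind_ID"] != pairs["ind_ID_partner"]]

result = (
    pairs.set_index("index")
    .sort_index()
    .rename(columns={"orig_data_partner": "partner_data"})
    .drop(columns="ind_ID_partner")
)
result.index.name = None
print(result)
```

Unlike reversing each group, this does not depend on the order of the rows inside a pair, at the cost of building the intermediate four-rows-per-pair frame.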

Answer 1: 3, accepted, 2018-02-19 21:41:42
