For example, if I have a DF like the following:
n from   km to
0    B  300  A
1    A  300  B
2    D  290  A
3    B  310  C
4    A  290  D
I would like to select rows 0, 1, 2 and 4, since each of them has another row in the same DF with from and to swapped.
import pandas as pd

keep = []
for index, row in df.iterrows():
    f, t = row['from'], row['to']
    # keep the row if some row has the reversed (from, to) pair
    if ((df['to'] == f) & (df['from'] == t)).any():
        keep.append(index)
df2 = df.loc[keep, ['to', 'from', 'km']]
> df2
  to from   km
0  A    B  300
1  B    A  300
2  A    D  290
4  D    A  290
Is it possible to do this without iteration over the rows?
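Here is one way: sort the from and to values within each row, then find the duplicated pairs.
import numpy as np
import pandas as pd
# sort each (from, to) pair so that A->B and B->A look identical, then flag repeated pairs
s = pd.DataFrame(np.sort(df[['from', 'to']].values, axis=1), index=df.index).duplicated(keep=False)
yourdf = df[s]
yourdf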
Out[32]:
   n from   km to
0  0    B  300  A
1  1    A  300  B
2  2    D  290  A
4  4    A  290  D
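For anyone who wants to run this end to end, here is a self-contained sketch of the same sort-then-duplicated idea. The sample frame is rebuilt from the question; the names pairs, mask and result are only illustrative. One caveat, as far as I can tell: two identical rows (say A to B listed twice) would also be flagged, since after sorting they are indistinguishable from a swapped pair.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'n': [0, 1, 2, 3, 4],
    'from': ['B', 'A', 'D', 'B', 'A'],
    'km': [300, 300, 290, 310, 290],
    'to': ['A', 'B', 'A', 'C', 'D'],
})

# Sort the two endpoint labels within each row, so B->A and A->B both become (A, B)
pairs = pd.DataFrame(np.sort(df[['from', 'to']].to_numpy(), axis=1), index=df.index)

# keep=False marks every member of a repeated pair, not only the later occurrences
mask = pairs.duplicated(keep=False)
result = df[mask]
print(result)
```

Passing index=df.index keeps the boolean mask aligned with df even when df does not have the default 0..n-1 index.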
Not as nice and short as Wen-Ben's answer, but maybe more intuitive: merge the df with itself.
ok = df.merge(df[['from', 'to']], left_on='to', right_on='from').query('from_x == to_y')['n']
df.loc[ok, :]
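A possible variant of the merge idea, sketched under the assumption that you would rather not rely on the n column matching the index labels: swap the two column names on a copy, then inner-merge on both columns at once. The names reversed_pairs and matched are hypothetical, and the sample frame is rebuilt from the question. Note that a row whose from equals its to would match itself here.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'n': [0, 1, 2, 3, 4],
    'from': ['B', 'A', 'D', 'B', 'A'],
    'km': [300, 300, 290, 310, 290],
    'to': ['A', 'B', 'A', 'C', 'D'],
})

# Swap the column names so that each row describes the reverse trip;
# drop_duplicates prevents repeated pairs from multiplying rows in the merge
reversed_pairs = (
    df[['from', 'to']]
    .rename(columns={'from': 'to', 'to': 'from'})
    .drop_duplicates()
)

# The inner merge keeps only rows whose reversed pair exists somewhere in df;
# reset_index/set_index carries the original row labels through the merge
matched = (
    df.reset_index()
    .merge(reversed_pairs, on=['from', 'to'], how='inner')
    .set_index('index')
)
print(matched)
```

Because the match is made on both columns in one merge, no query step is needed afterwards.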