How to find indices of specific rows in a dataframe in R
I have a dataframe, A, which looks like this:
col1  col2  col3
NL    6     9
UK    5     5
US    9     7
and I have a dataframe, B, consisting of a subset of the rows of the larger dataframe, which looks like this:
col1  col2  col3
NL    6     9
UK    5     5
Now I want to find the indices in A of the rows of B, so it should return 1 and 2. Does anyone know how to do this?
EDIT: Next, I also want to find the indices of the rows in A when B contains only the first two columns. In that case it should also return 1 and 2. Does anyone have an idea how to do this?
Generally, match gets the index. In our case, one approach is to paste the values of each row together into a single string and then get the index with match (here df1 stands for A and df2 for B):
match(do.call(paste, df2), do.call(paste, df1))
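As a minimal, self-contained sketch of the line above, the data from the question can be rebuilt and the lookup run directly. The names df1 (for A) and df2 (for B) follow the answer's code, and the column names col1, col2, col3 are assumed from the question's table.

```r
# Rebuild the example data from the question (df1 = A, df2 = B; assumed naming)
df1 <- data.frame(col1 = c("NL", "UK", "US"),
                  col2 = c(6, 5, 9),
                  col3 = c(9, 5, 7))
df2 <- df1[1:2, ]

# Collapse every row into one string, then look up each row of df2 in df1
key1 <- do.call(paste, df1)
key2 <- do.call(paste, df2)
idx <- match(key2, key1)
idx
```

Note that paste joins values with a space by default, so two different rows such as ("a b", "c") and ("a", "b c") would collapse to the same string. Passing a separator that cannot occur in the data, for example do.call(paste, c(df1, sep = "\r")), is one way to guard against that.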
If the two datasets share only a subset of their columns (matched by column name), get the vector of common column names with intersect, subset both datasets to those columns, paste the rows, and get the index with match:
nm1 <- intersect(names(df1), names(df2))
match(do.call(paste, df2[nm1]), do.call(paste, df1[nm1]))
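For the case in the EDIT, where B has only the first two columns, the same idea can be sketched as follows. The data and column names are again assumed from the question's table, and idx2 is just an illustrative variable name.

```r
# df1 = A with three columns; df2 = B with only the first two columns (assumed)
df1 <- data.frame(col1 = c("NL", "UK", "US"),
                  col2 = c(6, 5, 9),
                  col3 = c(9, 5, 7))
df2 <- df1[1:2, c("col1", "col2")]

# Columns present in both datasets
nm1 <- intersect(names(df1), names(df2))

# Match on the shared columns only
idx2 <- match(do.call(paste, df2[nm1]), do.call(paste, df1[nm1]))
idx2
```

Keep in mind that match returns only the first matching position, so if several rows of A agree with a row of B on the shared columns, only the first of them is reported.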
Another option is a join: create a row index in both datasets, join them, and extract the row index that comes from the larger dataset:
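library(dplyr)
df2 %>%
  mutate(rn = row_number()) %>%
  # join against df1 (the larger dataset), which carries its own row index
  left_join(df1 %>%
              mutate(rn = row_number()),
            by = c('col1', 'col2', 'col3')) %>%
  # rn.y is the row index taken from df1
  pull(rn.y)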
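The join idea does not depend on dplyr. As a rough base R variant (a swapped-in technique using merge, not the answer's own code), one could write something like the following; rn_A, rn_B and idx3 are hypothetical names introduced for illustration, and the data are assumed from the question.

```r
# Example data assumed from the question (df1 = A, df2 = B)
df1 <- data.frame(col1 = c("NL", "UK", "US"),
                  col2 = c(6, 5, 9),
                  col3 = c(9, 5, 7))
df2 <- df1[1:2, ]

# Attach a row index to each dataset under distinct names
df1_idx <- cbind(df1, rn_A = seq_len(nrow(df1)))
df2_idx <- cbind(df2, rn_B = seq_len(nrow(df2)))

# Left join B onto A by the shared columns
joined <- merge(df2_idx, df1_idx,
                by = c("col1", "col2", "col3"), all.x = TRUE)

# merge may reorder rows, so restore the original order of B
idx3 <- joined$rn_A[order(joined$rn_B)]
idx3
```

Unlike match, a join returns every matching row of A, so duplicated rows in A would yield more than one index per row of B.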