
How to find indices of specific rows in a data frame in R

I have a dataframe, A, which looks like this:

col1   col2   col3
NL     6      9
UK     5      5
US     9      7

I also have a dataframe, B, consisting of a subset of the rows of the larger dataframe A, which looks like this:

col1   col2   col3
NL     6      9
UK     5      5

Now I want to find the indices in A of the rows of B, so it should return 1 and 2. Does someone know how to do this?

EDIT: Next, I also want to find the indices of the rows in A when B contains only the first two columns. In that case it should also return 1 and 2. Does anyone have an idea how to do this?

Generally, match returns the index. In our case, one approach is to paste the columns of each row together into a single string and get the index with match (here df1 is the full dataframe A and df2 is the subset B):

match(do.call(paste, df2), do.call(paste, df1))
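
As a minimal sketch, the question's data can be rebuilt and the lookup run end to end. The names A and B follow the question (they play the roles of df1 and df2 in the answer); the numeric column types are an assumption, since the question shows only the printed values.

```r
# Data reconstructed from the question; column types are assumed
A <- data.frame(col1 = c("NL", "UK", "US"),
                col2 = c(6, 5, 9),
                col3 = c(9, 5, 7))
B <- A[1:2, ]

# Collapse every row into one string such as "NL 6 9"
key_A <- do.call(paste, A)
key_B <- do.call(paste, B)

# For each row of B, the position of its first match in A
idx <- match(key_B, key_A)
idx
# [1] 1 2
```

match reports only the first matching row, and gives NA for a row of B that does not occur in A. Because paste joins values with a space by default, values that themselves contain spaces could in principle produce colliding keys; passing a separator that cannot appear in the data is one way to guard against that.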

If the two datasets share only a subset of columns (by name), get the vector of common column names with intersect, subset both datasets to those columns, paste, and get the index with match:

nm1 <- intersect(names(df1), names(df2))
match(do.call(paste, df2[nm1]), do.call(paste, df1[nm1]))
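
For the case in the EDIT, where B holds only the first two columns, a small sketch could look like this. The names A and B2 are illustrative, and the data is rebuilt from the question with assumed column types.

```r
# Data reconstructed from the question; column types are assumed
A <- data.frame(col1 = c("NL", "UK", "US"),
                col2 = c(6, 5, 9),
                col3 = c(9, 5, 7))
# B2 (hypothetical name): the subset with only the first two columns
B2 <- A[1:2, c("col1", "col2")]

# Columns present in both data frames
nm1 <- intersect(names(A), names(B2))

# Build the row keys from the shared columns only, then look them up
idx <- match(do.call(paste, B2[nm1]), do.call(paste, A[nm1]))
idx
# [1] 1 2
```

With fewer columns in the key, several rows of A may match the same row of B2; match still returns only the first of them.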

Another option is a join: create a row index in both datasets, join them, and extract the row index that came from the full dataset:

library(dplyr)
df2 %>%
  mutate(rn = row_number()) %>%
  # join against df1 (the full data); its row index arrives as rn.y
  left_join(df1 %>%
              mutate(rn = row_number()),
            by = c('col1', 'col2', 'col3')) %>%
  pull(rn.y)
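
The same join idea can be sketched without dplyr, using base R's merge instead. This is a swapped-in alternative rather than the answer's code; rn_a and rn_b are hypothetical helper columns, and the data is rebuilt from the question with assumed column types.

```r
# Data reconstructed from the question; column types are assumed
A <- data.frame(col1 = c("NL", "UK", "US"),
                col2 = c(6, 5, 9),
                col3 = c(9, 5, 7))
B <- A[1:2, ]

# Helper row indices (hypothetical column names)
A$rn_a <- seq_len(nrow(A))
B$rn_b <- seq_len(nrow(B))

# Left join of B onto A by the shared data columns
joined <- merge(B, A, by = c("col1", "col2", "col3"), all.x = TRUE)

# merge sorts by the key columns, so restore B's original row order
joined <- joined[order(joined$rn_b), ]
idx <- joined$rn_a
idx
# [1] 1 2
```

Unlike match, a join returns one output row per matching pair, so a row of B that matches several rows of A would appear several times in the result.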


 