简体   繁体   English

R - 通过匹配值对行进行分组,然后将行添加到另一个数据框中的匹配列

[英]R - Grouping rows by matching value then adding rows to matching columns in another data frame

I am trying to add values from one data frame (ex2) to an existing data frame (ex1) based on two different columns.我正在尝试基于两个不同的列将一个数据框(ex2)中的值添加到现有数据框(ex1)。 As you can see, there is an ID column in both data frames.如您所见,两个数据框中都有一个 ID 列。 But in ex2, each column of ex1 is represented by a different row instead of a column.但是在 ex2 中,ex1 的每一列都由不同的行而不是列表示。 For each matching ID, I want to add the result from ex2$result to the matching row in ex1 under the appropriate column heading (if ex2$alpha[i] = a then ex2$result[i] gets added to ex1$a[z] where ex2$id[i]=ex1$id[z]).对于每个匹配的 ID,我想将 ex2$result 的结果添加到 ex1 中相应列标题下的匹配行(如果 ex2$alpha[i] = a 那么 ex2$result[i] 被添加到 ex1$a[ z] 其中 ex2$id[i]=ex1$id[z])。 Another complication is that not all of the columns from ex1 will have alpha value in ex2, so those should be set as 'NA'.另一个复杂因素是,并非 ex1 中的所有列在 ex2 中都具有 alpha 值,因此应将它们设置为“NA”。

ex1 <- data.frame(
  id = c(1:20),
  a = c(rep(1,5),rep(0,5),rep(NA,10)),
  b = c(rep(c(1,0),5),rep(NA,10)),
  c = c(rep(c(0,1),5),rep(NA,10)),
  d = c(rep(0,5),rep(1,5),rep(NA,10))
)

ex2 <- data.frame(
  id = c(rep(11,3),rep(12,3),rep(13,3),
         rep(14,2),rep(15,2),
         rep(16,4),rep(17,4),rep(18,4),rep(19,4),rep(20,4)),
  alpha = c(rep(c('a','b','d'),3),rep(c('a','b'),2),
rep(c('a','b','c','d'),5)),
  result = c(rep(c(0,1,1),11))
)

Thanks for your help!谢谢你的帮助!

I believe the attached snippet does what you want it to do.我相信附加的代码片段可以满足您的要求。 But it is hard to know from your toy data if it is feasible to write out the columns a to d in the mutate statement.但是从你的玩具数据中很难知道在 mutate 语句中写出 a 到 d 列是否可行。 There surely is a more clever programmatic way to approach this problem.肯定有一种更聪明的编程方式来解决这个问题。

ex1 <- data.frame(
    id = c(1:20),
    a = c(rep(1,5),rep(0,5),rep(NA,10)),
    b = c(rep(c(1,0),5),rep(NA,10)),
    c = c(rep(c(0,1),5),rep(NA,10)),
    d = c(rep(0,5),rep(1,5),rep(NA,10))
)

ex2 <- data.frame(
    id = c(rep(11,3),rep(12,3),rep(13,3),
          rep(14,2),rep(15,2),
          rep(16,4),rep(17,4),rep(18,4),rep(19,4),rep(20,4)),
    alpha = c(rep(c('a','b','d'),3),rep(c('a','b'),2),
            rep(c('a','b','c','d'),5)),
    result = c(rep(c(0,1,1),11))
)

library(tidyverse)
ex_2_wide <- pivot_wider(ex2, id_cols = id, names_from = alpha, values_from = result  )

joined <- full_join(ex1, ex_2_wide, by = c("id" = "id")) %>%
    mutate(a = coalesce(a.x, a.y)) %>%
    mutate(b = coalesce(b.x, b.y)) %>%
    mutate(c = coalesce(c.x, c.y)) %>%
    mutate(d = coalesce(d.x, d.y)) %>%
    select(-(a.x:c.y))

joined
#>    id a b  c  d
#> 1   1 1 1  0  0
#> 2   2 1 0  1  0
#> 3   3 1 1  0  0
#> 4   4 1 0  1  0
#> 5   5 1 1  0  0
#> 6   6 0 0  1  1
#> 7   7 0 1  0  1
#> 8   8 0 0  1  1
#> 9   9 0 1  0  1
#> 10 10 0 0  1  1
#> 11 11 0 1 NA  1
#> 12 12 0 1 NA  1
#> 13 13 0 1 NA  1
#> 14 14 0 1 NA NA
#> 15 15 1 0 NA NA
#> 16 16 1 1  0  1
#> 17 17 1 0  1  1
#> 18 18 0 1  1  0
#> 19 19 1 1  0  1
#> 20 20 1 0  1  1

Created on 2021-01-07 by the reprex package (v0.3.0)代表 package (v0.3.0) 于 2021 年 1 月 7 日创建

EDIT:编辑:

If we turn the problem around (we first make long tables, followed by join and merge, then pivot back wide), there is only a single step for merger, no matter how many columns you have.如果我们把问题转过来(我们先做长表,然后连接和合并,然后 pivot 回宽),合并只有一步,不管你有多少列。

library(tidyverse)

ex1_long <- pivot_longer(ex1, cols = a:d, names_to = "alpha")                                           

joined <- full_join(ex1_long, ex2, by = c("id" = "id", "alpha" = "alpha")) %>%
    mutate(value = coalesce(value, result)) %>% select(-result) %>%
    pivot_wider(id_cols = id, names_from = alpha, values_from = value)

joined
#> # A tibble: 20 x 5
#>       id     a     b     c     d
#>    <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1     1     1     1     0     0
#>  2     2     1     0     1     0
#>  3     3     1     1     0     0
#>  4     4     1     0     1     0
#>  5     5     1     1     0     0
#>  6     6     0     0     1     1
#>  7     7     0     1     0     1
#>  8     8     0     0     1     1
#>  9     9     0     1     0     1
#> 10    10     0     0     1     1
#> 11    11     0     1    NA     1
#> 12    12     0     1    NA     1
#> 13    13     0     1    NA     1
#> 14    14     0     1    NA    NA
#> 15    15     1     0    NA    NA
#> 16    16     1     1     0     1
#> 17    17     1     0     1     1
#> 18    18     0     1     1     0
#> 19    19     1     1     0     1
#> 20    20     1     0     1     1

Created on 2021-01-07 by the reprex package (v0.3.0)代表 package (v0.3.0) 于 2021 年 1 月 7 日创建

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM