How to collapse redundant rows together to eliminate mirrored NAs in two columns?

Question

I'm adapting this toy df from this question, which is similar to mine but different enough to leave me a little confused.

df <- data.frame(id1 = c("a", NA, NA, "c"),
                 id2 = c(NA,"a","a",NA),
                 id3 = c("a", "a", "e", "e"),
                 n1 = c(2,2,3,3),
                 n2 = c(2,2,1,1),
                 n3 = c(0,0,3,3),
                 n4 = c(0,0,2,2))

This produces a data frame that looks like this:

id1 id2 id3 n1 n2 n3 n4
a   NA  a   2  2  0  0
NA  a   a   2  2  0  0
NA  a   e   3  1  3  2
c   NA  e   3  1  3  2

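Apart from id1 and id2, the first two rows are identical to each other, and so are the last two. I'm trying to fill in the blanks so that the rows in each pair become completely identical; I can then apply distinct() to drop the now-duplicated rows, producing a data frame like this: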

id1 id2 id3 n1 n2 n3 n4
a   a   a   2  2  0  0
c   a  e   3  1  3  2

Is there a way to do this (preferably a tidyverse solution)? I'm basically trying to collapse the redundancy out of all my data.

Answer 1

Maybe something like this?

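library(dplyr)

df %>% 
  # rows that should collapse share id3 and n1..n4
  group_by(id3, n1, n2, n3, n4) %>% 
  # keep the non-missing id within each group
  summarise(id1 = na.omit(id1),
            id2 = na.omit(id2)) %>% 
  ungroup() %>% 
  select(id1,id2,id3,n1,n2,n3,n4)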

Output

# A tibble: 2 × 7
  id1   id2   id3      n1    n2    n3    n4
  <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 a     a     a         2     2     0     0
2 c     a     e         3     1     3     2

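This solution is tailored to this particular scenario. It would not work if, for example, a group contained more than one non-missing id1.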
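One way to test that assumption before collapsing is to count the distinct non-missing ids in each group. This is only a sketch: the names `check`, `n_id1` and `n_id2` are made up for illustration, and it assumes the missing values in id1 are real NA rather than the string "NA".

```r
library(dplyr)

df <- data.frame(id1 = c("a", NA, NA, "c"),
                 id2 = c(NA, "a", "a", NA),
                 id3 = c("a", "a", "e", "e"),
                 n1 = c(2, 2, 3, 3),
                 n2 = c(2, 2, 1, 1),
                 n3 = c(0, 0, 3, 3),
                 n4 = c(0, 0, 2, 2))

# Count the distinct non-missing ids per group. The summarise() approach
# above assumes every count is exactly 1.
check <- df %>%
  group_by(id3, n1, n2, n3, n4) %>%
  summarise(n_id1 = n_distinct(id1, na.rm = TRUE),
            n_id2 = n_distinct(id2, na.rm = TRUE),
            .groups = "drop")

# TRUE means each group can safely be collapsed to a single row
all(check$n_id1 == 1 & check$n_id2 == 1)
```

If the last line returns FALSE, some group holds several candidate ids and you would need to decide which one to keep before collapsing.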

Answer 2

Another possible solution, in which I first create an index to group by:

df <- data.frame(id1 = c("a", NA, NA, "c"),
                 id2 = c(NA,"a","a",NA),
                 id3 = c("a", "a", "e", "e"),
                 n1 = c(2,2,3,3),
                 n2 = c(2,2,1,1),
                 n3 = c(0,0,3,3),
                 n4 = c(0,0,2,2))

library(dplyr)
df %>%
  # rows 1-2 get index 1, rows 3-4 get index 2
  mutate(index = rep(seq_len(2), each=2)) %>%
  group_by(index) %>%
  arrange(id1) %>%
  # take the first non-missing value of every column within each index
  summarise(across(everything(), ~ first(.x[!is.na(.x)]))) %>%
  select(-index)
#> # A tibble: 2 × 7
#>   id1   id2   id3      n1    n2    n3    n4
#>   <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 a     a     a         2     2     0     0
#> 2 c     a     e         3     1     3     2

Created on 2022-07-09 by the reprex package (v2.0.1)
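The index above is hard-coded for four rows arranged in two adjacent pairs. For data of another shape, one option might be to derive the index from the columns the duplicate rows share, using dplyr's cur_group_id(). This is a rough sketch: `indexed` and `result` are illustrative names, and it assumes the missing ids are real NA values.

```r
library(dplyr)

df <- data.frame(id1 = c("a", NA, NA, "c"),
                 id2 = c(NA, "a", "a", NA),
                 id3 = c("a", "a", "e", "e"),
                 n1 = c(2, 2, 3, 3),
                 n2 = c(2, 2, 1, 1),
                 n3 = c(0, 0, 3, 3),
                 n4 = c(0, 0, 2, 2))

# Number the groups defined by the shared columns instead of
# hard-coding rep(seq_len(2), each = 2).
indexed <- df %>%
  group_by(id3, across(n1:n4)) %>%
  mutate(index = cur_group_id()) %>%
  ungroup()

# Collapse each index to the first non-missing value of every column
result <- indexed %>%
  group_by(index) %>%
  summarise(across(everything(), ~ first(.x[!is.na(.x)]))) %>%
  select(-index)

result
```

This does not depend on the duplicate rows being adjacent, but it still assumes that each group has at most one distinct non-missing value per id column.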

Answer 3

Another possible solution:

library(tidyverse)

df %>% 
  group_by(id3, across(n1:n4)) %>% 
  fill(id1:id2, .direction = "updown") %>% 
  ungroup() %>% 
  distinct()

#> # A tibble: 2 × 7
#>   id1   id2   id3      n1    n2    n3    n4
#>   <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 a     a     a         2     2     0     0
#> 2 c     a     e         3     1     3     2
