
Merge columns except NA values in R

I have these data frames:

dt.1

Father  Daughter
Peter     1
Josh      3
Cold      4
NA .      5
NA .      6
NA .      7

dt.2

Father  Weight
Peter     10
Josh      33
Cold      44
NA .      55
NA .      65
NA .      77

I want to merge them without matching on the NA values. I need this:

Father       Weight    Daughter
    Peter     10         1
    Josh      33         3
    Cold      44         4
    NA .      55         NA
    NA .      65         NA
    NA .      77         NA
    NA        NA          5
    NA        NA          6
    NA        AN          7

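I tried a regular merge:

new.data <- merge(dt1, dt2, by = "Father", all = TRUE)

But it did not work: the new data frame has more rows than I expected. So I would like to merge while matching only on actual (non-NA) values.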
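The extra rows come from how base `merge()` treats missing keys: by default an NA in one `Father` column matches every NA in the other, so the three NA rows on each side produce 3 x 3 = 9 combinations. A minimal sketch that reproduces this, assuming `dt1` and `dt2` hold the values shown in the question (the counter names `n_total` and `n_na` are only for illustration):

```r
dt1 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Daughter = c(1L, 3L, 4L, 5L, 6L, 7L),
                  stringsAsFactors = FALSE)
dt2 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Weight = c(10L, 33L, 44L, 55L, 65L, 77L),
                  stringsAsFactors = FALSE)

# Full outer join on Father; NA keys are matched against each other
new.data <- merge(dt1, dt2, by = "Father", all = TRUE)

n_total <- nrow(new.data)               # 3 named rows + 9 NA-by-NA rows
n_na    <- sum(is.na(new.data$Father))  # rows produced by NA matching NA
print(c(total = n_total, na_rows = n_na))
```

So the goal is a join in which the NA keys are kept but never paired with each other.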

Separately `filter` the rows of each dataset where `Father` is not NA, do a `full_join` on those, and bind the result with the remaining NA rows:

library(tidyverse)
dt1 %>% 
  filter(is.na(Father)) %>%
  bind_rows(dt2 %>% 
                filter(is.na(Father))) %>%
  bind_rows(full_join(dt1 %>% 
                        filter(!is.na(Father)),
                      dt2 %>% filter(!is.na(Father))))%>% 
  arrange(is.na(Father), is.na(Weight)) %>% 
  select(Father, Weight, Daughter)
#   Father Weight Daughter
#1  Peter     10        1
#2   Josh     33        3
#3   Cold     44        4
#4   <NA>     55       NA
#5   <NA>     65       NA
#6   <NA>     77       NA
#7   <NA>     NA        5
#8   <NA>     NA        6
#9   <NA>     NA        7

Another option is to `split` each data frame by the presence of NAs and then join or bind depending on a logical condition:

map2_df(split(dt1, is.na(dt1$Father)), split(dt2, is.na(dt2$Father)),
     ~ if(all(is.na(.x$Father))) bind_rows(.x, .y) else full_join(.x, .y))
#   Father Daughter Weight
#1  Peter        1     10
#2   Josh        3     33
#3   Cold        4     44
#4   <NA>        5     NA
#5   <NA>        6     NA
#6   <NA>        7     NA
#7   <NA>       NA     55
#8   <NA>       NA     65
#9   <NA>       NA     77
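The `split` approach pairs the groups of the two data frames by position, so it helps to see what `split()` returns. A small base R sketch, assuming the `dt1` values from the question (`no_na` is a hypothetical data frame with the NA rows dropped, used only to show the edge case):

```r
dt1 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Daughter = c(1L, 3L, 4L, 5L, 6L, 7L),
                  stringsAsFactors = FALSE)

# split() groups rows by the logical vector; the list names are "FALSE" and "TRUE"
parts <- split(dt1, is.na(dt1$Father))
part_names <- names(parts)

# With no NA in the key, only the "FALSE" group is returned
no_na <- dt1[!is.na(dt1$Father), ]
part_names_no_na <- names(split(no_na, is.na(no_na$Father)))

print(part_names)
print(part_names_no_na)
```

If one of the two data frames happens to have no NA in `Father`, its list has a single element and the position-wise pairing no longer lines up, so this approach is probably best used when both inputs are known to contain NA keys.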

Data

dt1 <- structure(list(Father = c("Peter", "Josh", "Cold", NA, NA, NA
), Daughter = c(1L, 3L, 4L, 5L, 6L, 7L)), class = "data.frame", 
row.names = c(NA, 
-6L))

dt2 <- structure(list(Father = c("Peter", "Josh", "Cold", NA, NA, NA
), Weight = c(10L, 33L, 44L, 55L, 65L, 77L)), class = "data.frame",
row.names = c(NA, 
-6L))

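Using dplyr and tidyr, you can replace the NAs in df1 and df2 (the two data frames from the question) with placeholders, join the data frames, and then convert the placeholders back to NA: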

library(dplyr)
library(tidyr)

replace_na(df1, list(Father = "NA1")) %>% 
    full_join(replace_na(df2, list(Father = "NA2"))) %>% 
    mutate(Father = sub("NA.*", NA, Father))

#### OUTPUT ####

 Father Daughter Weight
1  Peter        1     10
2   Josh        3     33
3   Cold        4     44
4   <NA>        5     NA
5   <NA>        6     NA
6   <NA>        7     NA
7   <NA>       NA     55
8   <NA>       NA     65
9   <NA>       NA     77
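The placeholder idea only works because the two tokens differ. A base R sketch of the same trick, assuming the question's data; `with_placeholder` is a hypothetical helper written for this illustration, not part of any package:

```r
dt1 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Daughter = c(1L, 3L, 4L, 5L, 6L, 7L),
                  stringsAsFactors = FALSE)
dt2 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Weight = c(10L, 33L, 44L, 55L, 65L, 77L),
                  stringsAsFactors = FALSE)

# Hypothetical helper: replace NA keys with a given token
with_placeholder <- function(d, token) {
  d$Father[is.na(d$Father)] <- token
  d
}

# Distinct tokens: NA rows of one table never match rows of the other
distinct <- merge(with_placeholder(dt1, "NA1"), with_placeholder(dt2, "NA2"),
                  by = "Father", all = TRUE)
distinct$Father[distinct$Father %in% c("NA1", "NA2")] <- NA

# Same token on both sides: the former NA rows cross-match again
shared <- merge(with_placeholder(dt1, "NA1"), with_placeholder(dt2, "NA1"),
                by = "Father", all = TRUE)

print(c(distinct = nrow(distinct), shared = nrow(shared)))
```

The tokens must also not collide with a real name in `Father`, which is an assumption worth checking on real data.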

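Using base R, you can first merge the parts of the data frames without NAs, then `rbind` the result with the NA parts:

# Inner-join only the rows whose key is known
df3 <- merge(subset(df1, !is.na(Father)), df2, by = "Father")
# Pad each data frame with the column it lacks so rbind() can align by name
df1$Weight <- df2$Daughter <- NA
rbind(df3, subset(df2, is.na(Father)), subset(df1, is.na(Father)))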

#### OUTPUT ####

   Father Daughter Weight
1    Cold        4     44
2    Josh        3     33
3   Peter        1     10
4    <NA>       NA     55
5    <NA>       NA     65
6    <NA>       NA     77
41   <NA>        5     NA
51   <NA>        6     NA
61   <NA>        7     NA
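The base R steps can be wrapped into a reusable function. This is only a sketch under the assumption that the join uses a single key column; the name `merge_non_na` and its arguments are made up for this example:

```r
dt1 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Daughter = c(1L, 3L, 4L, 5L, 6L, 7L),
                  stringsAsFactors = FALSE)
dt2 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Weight = c(10L, 33L, 44L, 55L, 65L, 77L),
                  stringsAsFactors = FALSE)

# Hypothetical helper: full join on `key`, keeping NA-key rows unmatched
merge_non_na <- function(x, y, key) {
  x_ok <- !is.na(x[[key]])
  y_ok <- !is.na(y[[key]])

  # Join only the rows whose key is known
  matched <- merge(x[x_ok, , drop = FALSE], y[y_ok, , drop = FALSE],
                   by = key, all = TRUE)

  # Pad the NA-key rows with the columns they lack so rbind() aligns by name
  x_na <- x[!x_ok, , drop = FALSE]
  y_na <- y[!y_ok, , drop = FALSE]
  for (col in setdiff(names(y), names(x))) x_na[[col]] <- rep(NA, nrow(x_na))
  for (col in setdiff(names(x), names(y))) y_na[[col]] <- rep(NA, nrow(y_na))

  out <- rbind(matched, y_na, x_na)
  rownames(out) <- NULL  # drop the made-unique row names such as 41, 51, 61
  out
}

out <- merge_non_na(dt1, dt2, "Father")
print(out)
```

Resetting the row names is a design choice: it avoids the `41`, `51`, `61` labels that `rbind()` generates when row names clash.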

