简体   繁体   中英

Combine values in two columns together based specific conditions in R

I have data that looks like the following:

moo <- data.frame(Farm = c("A","B",NA,NA,"A","B"), 
                  Barn_Yard = c("A","A",NA,"A",NA,"B"), 
                  stringsAsFactors=FALSE)
print(moo)
Farm Barn_Yard
 A         A
 B         A
<NA>      <NA>
<NA>       A
 A       <NA>
 B         B

I am attempting to combine the columns into one variable where if they are the same the results yields what is found in both columns, if both have data the result is what is in the Farm column, if both are <NA> the result is <NA> , and if one has a value and the other doesn't the result is the value present in the column that has the value. Thus, in this instance the result would be:

oink <- data.frame(Animal_House = c("A","B",NA,"A","A","B"), 
                   stringsAsFactors = FALSE)

print(oink)
Animal_House
        A
        B
     <NA>
        A
        A
        B

I have tried the unite function from tidyr but it doesn't give me exactly what I want. Any thoughts? Thanks!

dplyr::coalesce does exactly that, substituting any NA values in the first vector with the value from the second:

library(dplyr)

moo <- data.frame(Farm = c("A","B",NA,NA,"A","B"), 
                  Barn_Yard = c("A","A",NA,"A",NA,"B"), 
                  stringsAsFactors = FALSE)

oink <- moo %>% mutate(Animal_House = coalesce(Farm, Barn_Yard))

oink
#>   Farm Barn_Yard Animal_House
#> 1    A         A            A
#> 2    B         A            B
#> 3 <NA>      <NA>         <NA>
#> 4 <NA>         A            A
#> 5    A      <NA>            A
#> 6    B         B            B

If you want to discard the original columns, use transmute instead of mutate .

A less succinct option is to use a couple ifelse() statements, but this could be useful if you wish to introduce another condition or column into the mix.

moo <- data.frame(Farm = c("A","B",NA,NA,"A","B"), 
                  Barn_Yard = c("A","A",NA,"A",NA,"B"), 
                  stringsAsFactors = FALSE)

moo$Animal_House = with(moo,ifelse(is.na(Farm) & is.na(Barn_Yard),NA,
                                   ifelse(!is.na(Barn_Yard) & is.na(Farm),Barn_Yard,
                                          Farm)))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM