
Creating a new column based on values from multiple existing columns

I need to create a new column named "condition" (which does not exist initially) based on the first three columns. If the row's value comes from cond1, the condition column should be 1; if it comes from cond2, it should be 2; and so on. Any suggestions?

cond_test = read.csv("https://www.dropbox.com/s/du76g4vlfz2uaph/cond_test.csv?dl=1")
cond_test
#>   ï..cond1 cond2 cond3 condition
#> 1        2    NA    NA         1
#> 2        4    NA    NA         1
#> 3       NA     3    NA         2
#> 4       NA     5    NA         2
#> 5       NA     4    NA         2
#> 6       NA    NA     1         3
#> 7       NA    NA     4         3
#> 8       NA    NA     7         3

You can use max.col to get the column index of the non-NA value in each row.

max.col(!is.na(cond_test))
#[1] 1 1 2 2 2 3 3 3
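
To store the result as the new column, it may help to restrict the call to the three source columns so that no other column takes part in the comparison. A minimal sketch, assuming the data from the question is rebuilt inline with a plain cond1 header; the helper name cond_cols is hypothetical:

```r
# Rebuild the question's data inline (assumes a clean "cond1" column name)
cond_test <- data.frame(
  cond1 = c(2, 4, NA, NA, NA, NA, NA, NA),
  cond2 = c(NA, NA, 3, 5, 4, NA, NA, NA),
  cond3 = c(NA, NA, NA, NA, NA, 1, 4, 7)
)

# Hypothetical helper: names of the columns that determine the condition
cond_cols <- c("cond1", "cond2", "cond3")

# !is.na() yields a logical matrix; max.col() returns, per row,
# the index of the column holding TRUE (exactly one per row here)
cond_test$condition <- max.col(!is.na(cond_test[cond_cols]))
cond_test$condition
#> [1] 1 1 2 2 2 3 3 3
```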

If a row has more than one non-NA value, see the ties.method argument in ?max.col for how ties are handled (the default is "random").
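
As an illustration of the ties.method argument, the sketch below uses a small made-up data frame (demo is not part of the question's data) in which some rows have two non-NA values:

```r
# Hypothetical data: rows 2 and 3 each contain two non-NA values
demo <- data.frame(
  cond1 = c(2, 4, NA),
  cond2 = c(NA, 3, 5),
  cond3 = c(NA, NA, 1)
)

# "first" picks the leftmost non-NA column, "last" the rightmost
first_non_na <- max.col(!is.na(demo), ties.method = "first")
last_non_na  <- max.col(!is.na(demo), ties.method = "last")

first_non_na
#> [1] 1 1 2
last_non_na
#> [1] 1 2 3
```

Note that a row consisting only of NA values is itself a tie (all FALSE), so with ties.method = "first" it would return 1; such rows may need separate handling.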


In dplyr, you can use rowwise():

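library(dplyr)
cond_test %>%
  rowwise() %>%
  # explicit column selection; everything() was the default of c_across() in dplyr 1.0.x
  mutate(condition = which.max(!is.na(c_across(everything())))) %>%
  ungroup()  # drop the rowwise grouping afterwards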

I tried the following code and it works, but more elegant solutions are welcome.

cond_test$condition = ifelse(!is.na(cond_test$ï..cond1), 1, 
                             ifelse(!is.na(cond_test$cond2), 2, 3))
