Replace NA's using data from Multiple Columns

Question

I have a data frame that looks like this:

ID   col2  col3   col4 
1      5    NA    NA
2     NA    NA    1 
3      5    NA    NA
4     19    NA    1

If col2 has a value, that cell should not change (even if col3 and col4 contain values). However, if col2 is NA, I would like to fill it with any non-NA value from col3 or col4, if one exists.

The desired output is shown below. Notice that row 2 now has the "1" in col2.

ID   col2  col3   col4 
1      5    NA    NA
2      1    NA    1 
3      5    NA    NA
4     19    NA    1

I know this can be done manually by referencing each column using $ or [], but how can this be done using a for-loop or apply?

Thanks

Answer 1

We can do this with ifelse:

df1$col2 <- with(df1, ifelse(is.na(col2), pmax(col3, col4, na.rm = TRUE), col2))
df1$col2
#[1]  5  1  5 19
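
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example data from the question. The name df1 comes from the answer; storing col3 as numeric NA is an assumption, since the question does not show the column types. Note that pmax returns the larger value when both col3 and col4 are non-NA, and still returns NA when both are NA; the question does not say which value should win in the first case.

```r
# Rebuild the question's example data (column types are an assumption)
df1 <- data.frame(
  ID   = 1:4,
  col2 = c(5, NA, 5, 19),
  col3 = rep(NA_real_, 4),
  col4 = c(NA, 1, NA, 1)
)

# Keep col2 where it has a value; otherwise take the row-wise
# maximum of col3 and col4, ignoring NAs
df1$col2 <- with(df1, ifelse(is.na(col2), pmax(col3, col4, na.rm = TRUE), col2))

print(df1$col2)
```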

Or create a logical index and replace only the NA values:

i1 <- is.na(df1$col2)
df1$col2[i1] <- do.call(pmax, c(df1[i1, 3:4], na.rm = TRUE))
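
Since the question asks about a for-loop, here is a rough sketch of that variant. It is not the pmax approach above: it fills col2 with the first non-NA value in the order the source columns are listed, rather than the largest one. The helper name fill_from and its arguments are hypothetical, made up for illustration, and the data frame is rebuilt from the question with assumed numeric columns.

```r
# Rebuild the question's example data (column types are an assumption)
df1 <- data.frame(
  ID   = 1:4,
  col2 = c(5, NA, 5, 19),
  col3 = rep(NA_real_, 4),
  col4 = c(NA, 1, NA, 1)
)

# Hypothetical helper: fill NAs in `target` from each column named in
# `sources`, in order, so earlier source columns take priority
fill_from <- function(df, target, sources) {
  for (src in sources) {
    na_rows <- is.na(df[[target]])            # rows still missing a value
    df[[target]][na_rows] <- df[[src]][na_rows]
  }
  df
}

df1 <- fill_from(df1, "col2", c("col3", "col4"))
print(df1$col2)
```

Because the source columns are passed by name, the same call works for any number of fallback columns, e.g. c("col3", "col4", "col5") if such a column existed.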

solution1
1 ACCEPTED 2017-01-12 03:29:57
