Replacing NA with the value in an adjacent column in R

I want to replace the NA in the IMIAVG column with the value from the IMILEFT or IMIRIGHT column in the same row where necessary (e.g. rows 1, 6, and 7). I've tried several things, but nothing seems to work. Does this need a loop? Note that I keep getting errors about atomic vectors. Thanks!

  IMILEFT       IMIRIGHT       IMIAVG
  NA            71.15127         NA
  72.18310      72.86607      72.52458
  70.61460      68.00766      69.31113
  69.39032      69.91261      69.65146
  72.58609      72.75168      72.66888
  70.85714         NA            NA
  NA            69.88203         NA
  74.47109      73.07963      73.77536
  70.44855      71.28647      70.86751
  NA            72.33503         NA
  69.82818      70.45144      70.13981
  68.66929      69.79866      69.23397
  72.46879      71.50685      71.98782
  71.11888      71.98336      71.55112
  NA            67.86667         NA

If exactly one of IMILEFT and IMIRIGHT is non-NA in each affected row (as in your example), try the following (df is your data.frame):

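# rows whose average is missing
indx <- is.na(df$IMIAVG)
# with na.rm = TRUE the row sum equals the single non-missing value
df$IMIAVG[indx] <- rowSums(df[indx, 1:2], na.rm = TRUE)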
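
To see why this works, and where it can surprise you, here is a minimal sketch on a made-up three-row data frame (toy is a hypothetical name, not the asker's data). With na.rm = TRUE the NA is dropped, so a row with a single non-missing value sums to that value. A row where both columns are NA sums to 0 rather than NA, which is worth checking for in real data.

```r
# Hypothetical toy data; column names follow the question.
toy <- data.frame(IMILEFT  = c(NA, 70, NA),
                  IMIRIGHT = c(71, NA, NA),
                  IMIAVG   = c(NA, NA, NA_real_))

indx <- is.na(toy$IMIAVG)
toy$IMIAVG[indx] <- rowSums(toy[indx, 1:2], na.rm = TRUE)

# Rows 1 and 2 take the single available value; row 3 (both NA) becomes 0.
print(toy)
```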

By the way, if you want the mean of each row while excluding NA values, you can set the na.rm argument to TRUE in rowMeans. I guess you could compute your last column directly as:

rowMeans(df[,1:2],na.rm=TRUE)

to remove the problem at its root.
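
As a rough illustration of the rowMeans approach, here is a sketch on a small made-up data frame (toy is a hypothetical name). It assumes the average column is meant to be the plain mean of the two measurements. Rows with one NA fall back to the available value, while a row with both values missing yields NaN.

```r
# Hypothetical toy data; column names follow the question.
toy <- data.frame(IMILEFT  = c(NA, 72, NA),
                  IMIRIGHT = c(71, 74, NA))

# Mean of the two columns per row, ignoring missing values.
toy$IMIAVG <- rowMeans(toy[, 1:2], na.rm = TRUE)

# Row 1 uses the only available value, row 2 is the ordinary mean,
# and row 3 (both NA) is NaN.
print(toy)
```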

Data

df<-structure(list(IMILEFT = c(NA, 72.1831, 70.6146, 69.39032, 72.58609, 
70.85714, NA, 74.47109, 70.44855, NA, 69.82818, 68.66929, 72.46879, 
71.11888, NA), IMIRIGHT = c(71.15127, 72.86607, 68.00766, 69.91261, 
72.75168, NA, 69.88203, 73.07963, 71.28647, 72.33503, 70.45144, 
69.79866, 71.50685, 71.98336, 67.86667), IMIAVG = c(NA, 72.52458, 
69.31113, 69.65146, 72.66888, NA, NA, 73.77536, 70.86751, NA, 
70.13981, 69.23397, 71.98782, 71.55112, NA)), .Names = c("IMILEFT", 
"IMIRIGHT", "IMIAVG"), class = "data.frame", row.names = c(NA, 
-15L))

You could also use pmax

indx <- is.na(df$IMIAVG)
df$IMIAVG[indx] <- do.call(pmax, c(df[indx, 1:2], na.rm=TRUE))
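
The do.call line is equivalent to calling pmax on the two columns directly. A minimal sketch with made-up vectors (left and right are hypothetical names) shows the behaviour: with na.rm = TRUE, pmax returns the non-missing value when one side is NA, and NA only when both sides are missing (a sum with na.rm = TRUE would give 0 there instead).

```r
# Hypothetical vectors standing in for IMILEFT and IMIRIGHT.
left  <- c(NA, 70, NA)
right <- c(71, NA, NA)

# Element-wise maximum that skips NA where it can.
res <- pmax(left, right, na.rm = TRUE)

# Same call written the way the answer does it, via do.call on a list.
res2 <- do.call(pmax, c(list(left, right), na.rm = TRUE))

print(res)
```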

Or using data.table

library(data.table) 
setDT(df)[is.na(IMIAVG), IMIAVG:=pmax(IMILEFT, IMIRIGHT, na.rm=TRUE)]

Another option uses dplyr with a nested ifelse. First, read in the data:

df <- read.table(text = "IMILEFT       IMIRIGHT       IMIAVG
  NA            71.15127         NA
  72.18310      72.86607      72.52458
  70.61460      68.00766      69.31113
  69.39032      69.91261      69.65146
  72.58609      72.75168      72.66888
  70.85714         NA            NA
  NA            69.88203         NA
  74.47109      73.07963      73.77536
  70.44855      71.28647      70.86751
  NA            72.33503         NA
  69.82818      70.45144      70.13981
  68.66929      69.79866      69.23397
  72.46879      71.50685      71.98782
  71.11888      71.98336      71.55112
  NA            67.86667         NA", header = TRUE)

library(dplyr)

# Where IMIAVG is NA, take IMIRIGHT, or IMILEFT if IMIRIGHT is also NA
df %>%
  mutate(
    IMIAVG = ifelse(
      is.na(IMIAVG),
      ifelse(is.na(IMIRIGHT), IMILEFT, IMIRIGHT),
      IMIAVG
    )
  )
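
The nested ifelse can probably be written more compactly with dplyr's coalesce, which returns the first non-missing value across its arguments. The sketch below uses a made-up data frame (toy and out are hypothetical names) and keeps the same priority as above: IMIAVG first, then IMIRIGHT, then IMILEFT.

```r
library(dplyr)

# Hypothetical toy data; column names follow the question.
toy <- data.frame(IMILEFT  = c(NA, 70, 72),
                  IMIRIGHT = c(71, NA, 74),
                  IMIAVG   = c(NA, NA, 73))

# First non-missing value per row, checked in the order given.
out <- toy %>%
  mutate(IMIAVG = coalesce(IMIAVG, IMIRIGHT, IMILEFT))

print(out)
```

Note that coalesce expects its arguments to have compatible types, so this assumes all three columns are numeric.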

