I have the following data frame:
df <- data.frame(id = c("a", "a", "a", "a", "b", "b", "b", "b"),
                 time = 1:4, value = c(100, NA, NA, 550, 300, NA, NA, 900))
Can someone suggest an approach for replacing the NA values in df by spreading the difference between the known values evenly over time? For id a, value is 100 at time 1 and 550 at time 4. How would one change the NAs at times 2 and 3 to 250 and 400, and likewise to 500 and 700 for id b at times 2 and 3?
I can write a complex for loop to brute force it, but is there a more efficient solution?
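You could use na.approx from the zoo package, which fills NAs by linear interpolation. Note that the call below runs over the whole column and ignores id; it gives the right answer here because every id starts and ends with a known value.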
library(zoo)
df$value <- na.approx(df$value)
df
#   id time value
# 1  a    1   100
# 2  a    2   250
# 3  a    3   400
# 4  a    4   550
# 5  b    1   300
# 6  b    2   500
# 7  b    3   700
# 8  b    4   900
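If a group could start or end with NA, interpolating across the whole column would borrow values from the neighbouring id. A minimal sketch of a per-group variant, assuming zoo is installed; the column name filled is only illustrative and not part of the original data:

```r
library(zoo)

df <- data.frame(id = rep(c("a", "b"), each = 4),
                 time = rep(1:4, times = 2),
                 value = c(100, NA, NA, 550, 300, NA, NA, 900))

# Interpolate within each id separately. na.rm = FALSE keeps any leading or
# trailing NAs in place, so each group's result has the same length as its input.
df$filled <- ave(df$value, df$id,
                 FUN = function(v) na.approx(v, na.rm = FALSE))
df$filled
```

For the data in the question this should give the same values as the single na.approx call above, since no group begins or ends with NA.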
Or you can write your own vectorized helper, without loops or external packages. Unlike na.approx, it assumes that only the first and last values of each group are known and that the time steps are evenly spaced:
myna.approx <- function(x){
  len <- length(x)
  # constant step from the first to the last value, accumulated with cumsum()
  cumsum(c(x[1L], rep((x[len] - x[1L])/(len - 1L), len - 1L)))
}
with(df, ave(value, id, FUN = myna.approx))
## [1] 100 250 400 550 300 500 700 900
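If known values can also appear in the middle of a group, or the time steps are uneven, a base R sketch built on approx() from the stats package might look like this. interp_group is a hypothetical helper name, and the example rebuilds df so that it runs on its own:

```r
df <- data.frame(id = rep(c("a", "b"), each = 4),
                 time = rep(1:4, times = 2),
                 value = c(100, NA, NA, 550, 300, NA, NA, 900))

# Hypothetical helper: linearly interpolate the NA entries of `value` over `time`.
interp_group <- function(time, value) {
  ok <- !is.na(value)
  # approx() needs at least two known points; otherwise return the input unchanged
  if (sum(ok) < 2L) return(value)
  # With the default rule = 1, points outside the known range stay NA
  approx(x = time[ok], y = value[ok], xout = time)$y
}

# ave() hands each group's row indices to FUN, which looks up time and value
idx <- seq_len(nrow(df))
df$value <- ave(idx, df$id,
                FUN = function(i) interp_group(df$time[i], df$value[i]))
df$value
```

Because the interpolation uses the actual time column, this version does not rely on the rows being equally spaced, only on each id having at least two known values.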