Replacing NA in longitudinal data with average difference of non-missing values

Question

Here is a simplified version of the data I am working with:

 dat <- data.frame(
   country     = c("country1", "country2", "country3", "country1", "country2"),
   measurement = c("m1", "m1", "m1", "m2", "m2"),
   y2015       = c(NA, 15, 19, 13, 55),
   y2016       = c(NA, 17, NA, 10, NA),
   y2017       = c(14, NA, NA, 9, 45),
   y2018       = c(18, 22, 16, NA, 40)
 )

I am trying to take the difference between the two non-missing values on either side of a run of NAs and fill the gap in equal steps, i.e. with the average change per year.

For row 5, this would be something like c(55, 50, 45, 40).

However, it also needs to work for rows that have more than one missing value in a sequence, like row 1 and row 3. For row 1, I'd like the difference between 14 and 18 to be extended backwards, so it should look something like c(6, 10, 14, 18). For row 3, the difference between 19 and 16 should be divided evenly across the two missing years, giving something like c(19, 18, 17, 16).

Essentially, I'm looking to create a slope for each country and measurement through the available years, and to interpolate the missing values based on that slope.

I am trying to think of a package for this, or perhaps to write a loop. I have looked at the 'spline' package, but it does not seem to work, since I want to run a separate linear interpolation for each country and measurement.

Any thoughts would be greatly appreciated!

Answer 1

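Use zoo::na.spline, applied row-wise by transposing the year columns:

library(zoo)

# na.spline fills NAs column by column, so transpose the year columns
# (everything except the two id columns), fill, and transpose back.
dat[-c(1:2)] <- t(na.spline(t(dat[-c(1:2)])))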

   country measurement y2015 y2016    y2017 y2018
1 country1          m1     6    10 14.00000    18
2 country2          m1    15    17 19.33333    22
3 country3          m1    19    18 17.00000    16
4 country1          m2    13    10  9.00000    10
5 country2          m2    55    50 45.00000    40
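
Note that na.spline fits a spline rather than straight lines, which is why row 2 gets 19.33333 and row 4 ends at 10 in the output above. If strictly linear filling is wanted, as the question describes, one possible base R sketch is below. The helper name fill_linear is hypothetical, and the rule for leading and trailing NAs (extend the slope of the nearest two observed values) is an assumption about what the asker intends, not something stated in the question.

```r
# Same example data as in the question.
dat <- data.frame(
  country     = c("country1", "country2", "country3", "country1", "country2"),
  measurement = c("m1", "m1", "m1", "m2", "m2"),
  y2015       = c(NA, 15, 19, 13, 55),
  y2016       = c(NA, 17, NA, 10, NA),
  y2017       = c(14, NA, NA, 9, 45),
  y2018       = c(18, 22, 16, NA, 40)
)

# Hypothetical helper: fill NAs in one row of equally spaced yearly values.
fill_linear <- function(y) {
  x  <- seq_along(y)
  ok <- !is.na(y)
  if (sum(ok) < 2) return(y)          # a slope needs at least two observed values

  xo <- x[ok]
  yo <- y[ok]
  n  <- length(xo)

  # Interior gaps: straight line between the neighbouring observed values.
  # stats::approx() returns NA outside the observed range by default.
  out <- approx(xo, yo, xout = x)$y

  # Leading gap (assumption): extend the slope of the first two observed values.
  lead <- x < xo[1]
  out[lead] <- yo[1] + (x[lead] - xo[1]) * (yo[2] - yo[1]) / (xo[2] - xo[1])

  # Trailing gap (assumption): extend the slope of the last two observed values.
  trail <- x > xo[n]
  out[trail] <- yo[n] + (x[trail] - xo[n]) * (yo[n] - yo[n - 1]) / (xo[n] - xo[n - 1])

  out
}

yrs <- c("y2015", "y2016", "y2017", "y2018")

# apply() returns one column per input row, so transpose back before assigning.
dat[yrs] <- t(apply(dat[yrs], 1, fill_linear))
dat
```

With this rule, rows 1, 3 and 5 match the values asked for in the question, while row 2 becomes 19.5 for 2017 and row 4 becomes 8 for 2018, because each gap only uses the two nearest observed values. Because every row is handled separately, each country and measurement combination gets its own slope.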

solution1
0 2022-08-09 08:49:22