
Dealing with missing values with the R filter() function

I'd like to handle missing values using the filter() function in R.

In fact, I wish to compute X_t = 1/(2*T+1) * sum(X_i, i = (t-T)...(t+T)), where (X_t) is a classical time series containing missing values. filter() computes sums over the time intervals [(t-T);(t+T)], but it does not give the mean of the values excluding the NAs.

Does anyone have an idea how to deal with that?

Try this:

library(zoo)
x <- 1:10
x[6] <- NA
rollapply(x, 3, mean, na.rm = TRUE)
## [1] 2.0 3.0 4.0 4.5 6.0 7.5 8.0 9.0

There are a variety of other arguments that you may or may not need depending on exactly what you want to get out. See ?rollapply .
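As a rough illustration of two of those arguments: the call above returns only 8 values for 10 inputs, because windows that would run past the ends are dropped. The sketch below assumes you want a result of the same length as the input; partial and fill are existing rollapply arguments, while the variable names ma_partial and ma_filled are made up for this example.

```r
library(zoo)

x <- 1:10
x[6] <- NA

# partial = TRUE shrinks the window at the edges instead of dropping it,
# so the first and last values are means over two points only.
ma_partial <- rollapply(x, 3, mean, na.rm = TRUE, partial = TRUE)

# fill = NA keeps full three-point windows only and pads the edges with NA.
ma_filled <- rollapply(x, 3, mean, na.rm = TRUE, fill = NA)
```

Both results have length 10; which edge treatment is appropriate depends on how much you trust averages taken over fewer points.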

REVISED: I have updated the answer based on a more recent version of rollapply, which allows a simplification.

The sapply trick did not quite work for me. You have to pad the initial vector with NAs to get it to work with values of k larger than 1. Here is my code:

k <- 1  ## Moving average over three points.
x <- c(rep(1,5), NA, rep(1,5)) # input vector
stmp <- c( rep(NA,k), x, rep(NA,k) ) # pad both ends with k NAs
## Index into the padded vector so the window never leaves its bounds.
smooth <- sapply((k+1):(k+length(x)), function(i){mean(stmp[(i-k):(i+k)], na.rm=TRUE)})

I also added a function statement so the code runs without error. Hope it helps :)
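The padding idea can be wrapped in a small helper so that it is easy to try with a wider window. This is only a sketch: the function name padded_moving_average and the test vector are made up for illustration, and it assumes a centered window of 2k+1 points.

```r
# Centered moving average over 2k+1 points that ignores NAs.
# The input is padded with k NAs on each side so every window
# stays inside the padded vector, also for k > 1.
padded_moving_average <- function(x, k) {
  padded <- c(rep(NA, k), x, rep(NA, k))
  sapply(seq_along(x) + k, function(i) {
    mean(padded[(i - k):(i + k)], na.rm = TRUE)
  })
}

x <- c(1, 2, NA, 4, 5)
smooth_k2 <- padded_moving_average(x, 2)
## first value: mean(c(1, 2)) = 1.5, last value: mean(c(4, 5)) = 4.5
```

A window that contains nothing but NAs yields NaN, since mean() of an empty vector is NaN.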

If you want a simple moving average over 2k+1 points, you can do this:

x <- c(rep(1,5), NA, rep(1,5))
k <- 1  ## Moving average over three points.
smooth <- sapply(1:length(x), function(i) mean(x[(i-k):(i+k)], na.rm=TRUE))

which results in a vector of all ones in this case.
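If you would rather stay with filter() itself, as the question asks, one possible approach is to filter twice: once over the values with the NAs replaced by zero, and once over an indicator of which values are present, then divide the two. The sketch below assumes a centered window; the function name na_moving_average and its argument names are hypothetical.

```r
# NA-aware centered moving average built on stats::filter().
na_moving_average <- function(x, half_width) {
  weights <- rep(1, 2 * half_width + 1)
  observed <- !is.na(x)
  x_zeroed <- ifelse(observed, x, 0)  # NAs contribute 0 to the window sum
  window_sum <- stats::filter(x_zeroed, weights, sides = 2)
  window_count <- stats::filter(as.numeric(observed), weights, sides = 2)
  as.numeric(window_sum / window_count)  # sum of present values / their count
}

x <- 1:10
x[6] <- NA
smooth_filter <- na_moving_average(x, 1)
## NA 2.0 3.0 4.0 4.5 6.0 7.5 8.0 9.0 NA
```

The two ends are NA because filter() does not evaluate incomplete windows, and a window made up entirely of NAs gives NaN (0 divided by 0).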
