R diff() handling NA

I would like to calculate the first difference of a variable even when either the current value or the lagged value is missing. The R diff() function returns NA if either value is missing. Can this behavior be changed?

data <- c(5, NA, NA, 10, 25)

diff_i_want <- c(-5, NA, 10, 15)

diff_i_get <- diff(data)

identical(diff_i_want, diff_i_get)
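
For reference, here is a minimal sketch of why diff() produces NA here: each element is the difference of a consecutive pair, and arithmetic involving NA yields NA. The name manual_diff is introduced only for illustration.

```r
data <- c(5, NA, NA, 10, 25)

# diff() on a plain vector computes x[i + 1] - x[i] for each consecutive pair
manual_diff <- data[-1] - data[-length(data)]

print(diff(data))
# [1] NA NA NA 15

# The manual subtraction matches diff(): only the last pair has no NA
print(identical(manual_diff, diff(data)))
```

So three of the four differences are lost, even though two of those pairs contain one usable value.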

You can replace the NAs with zeros:

x <- c(5, NA, NA, 10, 25)
diff(replace(x, is.na(x), 0))
# [1] -5  0 10 15

Admittedly, this is different from your diff_i_want, but I'm not sure of your logic. How do you get -5 as the first element of your answer? The only way to get there is to implicitly replace NA with zero. So if you make that replacement there, why not for the next element as well?

Though your desired answer doesn't make much sense to me, it is possible to obtain it, e.g. using zoo::rollapply:

# first define a function that takes a vector of length 2
# ... and will output the difference if no more than 1 of the values is missing
weirddiff <- function(x) {
  if(any(is.na(x)) && !all(is.na(x))) x[is.na(x)] <- 0
  x[2] - x[1]
}

Now we can use rollapply with the window width set to 2:

library(zoo)
rollapply(x, 2, weirddiff)
# [1] -5 NA 10 15
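
If you'd rather avoid the zoo dependency, the same pairwise rule can probably be applied in base R. This is only a sketch: it repeats the weirddiff definition so that it runs on its own, assumes x has at least two elements, and the name pairwise is made up for the example.

```r
# Same rule as above: if exactly one value of the pair is NA, treat it as 0
weirddiff <- function(x) {
  if (any(is.na(x)) && !all(is.na(x))) x[is.na(x)] <- 0
  x[2] - x[1]
}

x <- c(5, NA, NA, 10, 25)

# Apply weirddiff to each consecutive pair (x[i], x[i + 1]) without zoo
pairwise <- vapply(
  seq_len(length(x) - 1),
  function(i) weirddiff(x[i:(i + 1)]),
  numeric(1)
)

print(pairwise)
# [1] -5 NA 10 15
```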

Here is a way:

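data <- c(5, NA, NA, 10, 25)

# Work on a copy in which every NA is treated as 0
data2 <- data
data2[is.na(data2)] <- 0
diffData2 <- diff(data2)

# Restore NA where both the current and the previous value were missing
diffData2[diff(is.na(data)) == 0 & is.na(data[-1])] <- NA

diffData2
# [1] -5 NA 10 15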

First copy data to data2, set all NAs to 0, and take the diff. In the last step, put NA back into the positions of the calculated diff where both the current and the previous value were missing.
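
The steps above could be wrapped in a small reusable function. This is a sketch under the same assumption (a lone NA in a pair counts as 0); diff_na_as_zero is a hypothetical name, not something from base R or any package.

```r
# diff_na_as_zero is a hypothetical helper name, not part of base R
diff_na_as_zero <- function(x) {
  is_na <- is.na(x)
  filled <- replace(x, is_na, 0)   # treat NA as 0 for the subtraction
  out <- diff(filled)
  # restore NA where both the current and the previous value were missing
  both_missing <- is_na[-1] & is_na[-length(x)]
  out[both_missing] <- NA
  out
}

print(diff_na_as_zero(c(5, NA, NA, 10, 25)))
# [1] -5 NA 10 15
```

The both_missing mask is equivalent to the diff(is.na(data)) == 0 & is.na(data[-1]) condition used above; it just spells out the "both values are NA" test directly. For input without any NA the function returns the same result as plain diff().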
