Replace NA with average of the case before and after the NA

Say I have the following data.frame:

t<-c(1,1,2,4,5,4)
u<-c(1,3,4,5,4,2)
v<-c(2,3,4,5,NA,2)
w<-c(NA,3,4,5,2,3)
x<-c(2,3,4,5,6,NA)

df<-data.frame(t,u,v,w,x)

I would like to replace the NAs with values that represent the average of the case before and after the NA, unless a row starts (row 4) or ends (row 5) with an NA. When the row begins with NA, I would like to substitute the NA with the following case. When the row ends with NA, I would like to substitute the NA with the previous case.

Thus, I would like my output to look like:

t<-c(1,1,2,4,5,4)
u<-c(1,3,4,5,4,2)
v<-c(2,3,4,5,3.5,2)
w<-c(3,3,4,5,2,3)
x<-c(2,3,4,5,6,6)

df<-data.frame(t,u,v,w,x)

The question refers to row 4 starting with NA and row 5 ending in NA, but in fact it is column 4 of the input df that starts with an NA and column 5 that ends with one. Neither row 4 nor row 5 of the input starts or ends with an NA, so we will assume that column was meant, not row. Also, there are two data frames both named df in the question. Evidently one is supposed to represent the input and the other the output, but for complete clarity we have repeated the definition of the df we used in the Note at the end.

na.approx in the zoo package pretty much does this; rule = 2 fills a leading or trailing NA with the nearest non-NA value. (If a matrix result is OK then omit the data.frame() part.)

library(zoo)

data.frame(na.approx(df, rule = 2))

giving:

  t u   v w x
1 1 1 2.0 3 2
2 1 3 3.0 3 3
3 2 4 4.0 4 4
4 4 5 5.0 5 5
5 5 4 3.5 2 6
6 4 2 2.0 3 6
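
To see what rule = 2 is doing, here is a minimal base R sketch of the same idea using stats::approx. The helper name fill_na_linear is hypothetical (it is not part of zoo), and this is only an illustration of linear interpolation with the ends extended, not a reproduction of how na.approx is implemented.

```r
# Hypothetical helper: linearly interpolate interior NAs and extend the
# nearest observed value into leading/trailing NAs (rule = 2).
fill_na_linear <- function(x) {
  ok <- !is.na(x)
  # approx() needs at least two known points for linear interpolation
  if (sum(ok) < 2) return(x)
  approx(x = which(ok), y = x[ok], xout = seq_along(x), rule = 2)$y
}

# Input data from the question
df <- data.frame(
  t = c(1, 1, 2, 4, 5, 4),
  u = c(1, 3, 4, 5, 4, 2),
  v = c(2, 3, 4, 5, NA, 2),
  w = c(NA, 3, 4, 5, 2, 3),
  x = c(2, 3, 4, 5, 6, NA)
)

# Apply column by column and rebuild the data frame
filled <- as.data.frame(lapply(df, fill_na_linear))
filled
```

For a single isolated NA, linear interpolation between the two neighbours is the same as their average. For a run of several adjacent NAs the interpolated values lie on a straight line between the surrounding observations, which may or may not be what is wanted.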

Note: For clarity, we used this data frame as input above:

df <- structure(list(t = c(1, 1, 2, 4, 5, 4), u = c(1, 3, 4, 5, 4, 
2), v = c(2, 3, 4, 5, NA, 2), w = c(NA, 3, 4, 5, 2, 3), x = c(2, 
3, 4, 5, 6, NA)), .Names = c("t", "u", "v", "w", "x"), row.names = c(NA, 
-6L), class = "data.frame")
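
Alternatively, in base R (this returns a matrix):

sapply(df, function(x){
    # average the previous and next values; na.rm = TRUE falls back to the
    # single available neighbour at the start or end of a column
    replace(x, is.na(x), rowMeans(cbind(c(NA, head(x, -1)), c(x[-1], NA)), na.rm = TRUE)[is.na(x)])
})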
#     t u   v w x
#[1,] 1 1 2.0 3 2
#[2,] 1 3 3.0 3 3
#[3,] 2 4 4.0 4 4
#[4,] 4 5 5.0 5 5
#[5,] 5 4 3.5 2 6
#[6,] 4 2 2.0 3 6
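
The one-liner inside sapply can be hard to read, so the following sketch unpacks it for a single column (column w of the input). The intermediate names prev_val, next_val and neighbour_mean are introduced here for illustration only.

```r
x <- c(NA, 3, 4, 5, 2, 3)            # column w from the question

prev_val <- c(NA, head(x, -1))       # value before each position (none before the first)
next_val <- c(x[-1], NA)             # value after each position (none after the last)

# Row-wise mean of the two neighbours; na.rm = TRUE means a missing
# neighbour is ignored, so the ends use the single neighbour they have
neighbour_mean <- rowMeans(cbind(prev_val, next_val), na.rm = TRUE)

# Only overwrite the positions that were NA in the original vector
filled <- replace(x, is.na(x), neighbour_mean[is.na(x)])
filled
```

This approach looks only at the immediate neighbours. If a neighbour is itself NA it is dropped from the mean, and if both neighbours are NA (the middle of a run of three or more NAs) rowMeans returns NaN, so the result is only fully filled when NAs are reasonably isolated.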
