
How to calculate the average of certain consecutive data values in R

I have a data set, for example:

a<-c(1,2,3,4,5,6,7,8,9)

I want to calculate the average of every three consecutive data values, that is, of the values at positions

1:3,4:6,7:9

What command should I use?

This is another way:

Make another vector that contains a different level for each of the ranges 1:3, 4:6 and 7:9:

a<-c(1,2,3,4,5,6,7,8,9)
b<-rep(1:3,each=3)
x<-ave(a, b, FUN=mean)  #use ave to find the means
x
#[1] 2 2 2 5 5 5 8 8 8  - gives this output

x[seq(1, length(x), 3)]  #this will output every 3rd element, giving:
#[1] 2 5 8

And if you want it in one line:

ave(a, rep(1:3,each=3), FUN=mean)[seq(1, length(a), 3)]
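The grouping vector above is hard-coded for nine values. As a minimal sketch, assuming you want the same idea for any vector length and block size, the block index can be computed with integer division. The name block_means is made up for this illustration and does not come from any package:

```r
# Hypothetical helper: mean of each run of k consecutive values.
block_means <- function(x, k) {
  # Block index: 0 for the first k values, 1 for the next k, and so on.
  block <- (seq_along(x) - 1) %/% k
  # tapply() computes one mean per block; as.vector() drops the names.
  as.vector(tapply(x, block, mean))
}

block_means(c(1, 2, 3, 4, 5, 6, 7, 8, 9), 3)
# [1] 2 5 8
block_means(1:11, 3)
# [1]  2.0  5.0  8.0 10.5
```

With this grouping, a shorter final block (here 10 and 11) is still averaged on its own.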

An additional way: use a rolling mean function (e.g. from the zoo package or the TTR package) and select every 3rd element of the result:

library(TTR)
runMean(a,3)[seq(3, length(a), 3)]
#[1] 2 5 8

Of course, this principle can be extended to the base R way of calculating rolling averages:

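# stats:: makes sure the base filter is used even if another package (e.g. dplyr) masks filter()
stats::filter(a, rep(1/3,3), sides=1)[seq(3, length(a), 3)]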

Here's a possible RcppRoll approach:

library(RcppRoll)
n <- 3 # The window size (number of values per average)
a <- 1:9 # Your vector
roll_mean(a, n)[seq_len(length(a) - n + 1) %% n == 1]
## [1] 2 5 8

1) rollapply. Try this:

library(zoo)
a <- 1:9
rollapply(a, 3, mean, by = 3, align = "left", partial = TRUE)
## [1] 2 5 8

It also works if the length of a is not a multiple of 3, in which case it still averages the shorter group at the end. If you want a shorter group at the end to be dropped, omit the partial = TRUE argument. If you know that the length of a is always a multiple of 3, the partial = TRUE argument can also be omitted, since it has no effect in that case.
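To make the "drop the shorter group at the end" choice concrete without any package, here is a rough base R sketch. It is not how rollapply works internally; complete_block_means is a hypothetical name used only for this example:

```r
# Hypothetical helper: block means that ignore an incomplete final block.
complete_block_means <- function(x, k) {
  n_full <- length(x) %/% k        # number of complete blocks
  x <- x[seq_len(n_full * k)]      # drop the leftover values, if any
  colMeans(matrix(x, nrow = k))    # one column per block
}

complete_block_means(1:11, 3)
# [1] 2 5 8
```

Here the trailing values 10 and 11 are discarded because they do not fill a whole block of 3.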

2) tapply. Here is a second alternative approach. as.integer(gl(n, 3, n)) creates a grouping vector c(1, 1, 1, 2, 2, 2, ...) of length n, and tapply then applies mean to the values of a in each group. as.integer() is used rather than c() because in recent versions of R c() keeps the factor, and its unused levels would show up as NA groups:

n <- length(a)
tapply(a, as.integer(gl(n, 3, n)), mean)
## 1 2 3 
## 2 5 8 

3) aggregate. Similar to tapply, but gives a data frame as output:

n <- length(a)
group <- gl(n, 3, n)
aggregate(a ~ group, FUN = mean)
##   group a
## 1     1 2
## 2     2 5
## 3     3 8

This worked for me as well:

v  <- 1:9  # a given vector
gr <- 3    # consider a sequence of 3 consecutive elements
length(v) <- prod(dim(matrix(v, nrow=gr))) # will stretch the vector with NA-s if needed
colMeans(matrix(v, nrow=gr), na.rm=TRUE)
[1] 2 5 8

Pay attention to recycling when converting from a vector to a matrix. For example:

v  <- 1:11
gr <- 3 
length(v) <- prod(dim(matrix(v, nrow=gr))); v
[1]  1  2  3  4  5  6  7  8  9 10 11 NA

# Will warn about the recycling 
# Warning message:
# In matrix(v, nrow = gr) :
#  data length [11] is not a sub-multiple or multiple of the number of rows [3]
# But the conversion will take place considering the NA-s:

m <- matrix(v, nrow=gr); m
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   NA
colMeans(m, na.rm=TRUE)
[1]  2.0  5.0  8.0 10.5
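If you would rather not trigger the warning at all, one option is to pad the vector with NA explicitly before building the matrix. This is only a sketch of that variation; the variable pad is introduced here for illustration:

```r
v  <- 1:11
gr <- 3
# Number of NAs needed to reach the next multiple of gr (1 in this case).
pad <- (-length(v)) %% gr
# The padded vector has length 12, so matrix() does not need to recycle.
m <- matrix(c(v, rep(NA, pad)), nrow = gr)
colMeans(m, na.rm = TRUE)
# [1]  2.0  5.0  8.0 10.5
```

When the length is already a multiple of gr, pad is 0 and nothing is appended.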

An option with data.table:

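library(data.table)
# length.out makes the grouping column as long as the data, so nothing has to be recycled
dt <- data.table(1:11, rep(1:3, each = 3, length.out = 11))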
dt
    V1 V2
 1:  1  1
 2:  2  1
 3:  3  1
 4:  4  2
 5:  5  2
 6:  6  2
 7:  7  3
 8:  8  3
 9:  9  3
10: 10  1
11: 11  1
dt[, mean(V1), by = rleid(V2)]$V1
[1]  2.0  5.0  8.0 10.5

