Grouping data in R to apply a function

Question

Here is an example of my data:

           id   score
1          82   0.50000
2          82   0.39286
3          82   0.56250
4         328   0.50000
5         328   0.67647
6         328   0.93750
7         328   0.91667

I want to add a column holding the moving average of the scores for each id.

So I need to group the data by id, apply a moving-average (MA) function to each group, and store the output in another column, "MA_score".

I would like my output to look like this:

           id   score    MA_score
1          82   0.50000   NULL
2          82   0.39286   0.xxxx
3          82   0.56250   NULL
4         328   0.50000   NULL
5         328   0.67647   0.yyyy
6         328   0.93750   0.qqqq
7         328   0.91667   NULL

Answer 1

You could use split together with rollapply from the zoo package, as one of many ways to approach this. Note that in the example below I set the width of the rollapply function to 1, so it just returns each value. For widths greater than one it takes the mean of that many consecutive values.

require(zoo)
sapply( split( df , df$id) , function(x) rollapply( x , width = 1 , align = 'left' , mean) )
#Note that by setting width = 1 we just return the value
$`82`
     id   score
[1,] 82 0.50000
[2,] 82 0.39286
[3,] 82 0.56250

$`328`
      id   score
[1,] 328 0.50000
[2,] 328 0.67647
[3,] 328 0.93750
[4,] 328 0.91667

If we set width = 3 instead, we get:

$`82`
     id   score
[1,] 82 0.48512

$`328`
      id     score
[1,] 328 0.7046567
[2,] 328 0.8435467
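As a quick sanity check on the width = 3 output above, the window means can be reproduced by hand in base R. This is only a sketch using the scores for id 328 from the question; the variable names are made up for illustration.

```r
# Scores for id 328, copied from the example data
scores_328 <- c(0.5, 0.67647, 0.9375, 0.91667)

# Each width-3 window is the plain mean of three consecutive scores
window_1 <- mean(scores_328[1:3])  # rows 4 to 6 of the data
window_2 <- mean(scores_328[2:4])  # rows 5 to 7 of the data
round(c(window_1, window_2), 7)    # 0.7046567 0.8435467

# embed() lays out all length-3 windows as rows, so rowMeans() gives
# every window mean at once
all_windows <- rowMeans(embed(scores_328, 3))
```

With four scores and a window of three there are only two complete windows, which is why the width = 3 result for id 328 has two rows.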

Or you could use aggregate from base R (still with rollapply from zoo):

aggregate(  score ~ id , data = df , function(x) rollapply( x , width = 1 , align = 'left' , mean)  )
   id                              score
1  82          0.50000, 0.39286, 0.56250
2 328 0.50000, 0.67647, 0.93750, 0.91667

There are quite a few ways to do this. I would define your moving average function precisely, though, because there are many ways to calculate it (see, for example, TTR::SMA).

Or, more directly, using ave:

# fill = NA pads the ends of each group (replaces the deprecated na.pad = TRUE)
within(df, { MA_score <- ave(score, id, FUN=function(x)
                rollmean(x, k=3, fill = NA))})
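To illustrate why the definition of the moving average matters, here is a small base R sketch comparing a centered window with a trailing one, using stats::filter rather than zoo. The variable names are illustrative only. The centered variant lines up with rollmean's default alignment; the trailing one averages each value with the two before it.

```r
x <- c(0.5, 0.67647, 0.9375, 0.91667)  # scores for id 328

# Centered window: each value is averaged with one neighbour on either side
centered <- as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 2))

# Trailing window: each value is averaged with the two values before it
trailing <- as.numeric(stats::filter(x, rep(1 / 3, 3), sides = 1))

round(centered, 5)  # NA 0.70466 0.84355 NA
round(trailing, 5)  # NA NA 0.70466 0.84355
```

The same two averages appear in both results, but they are attached to different rows, so the choice changes which rows end up as NA in MA_score.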

Answer 2

You could split your data by unique ID values, calculate the rolling mean (from the 'zoo' package) for each of these unique IDs, and append the results to your initial data frame:

# Required packages
library(zoo)

# Data setup
df <- data.frame(id = c(82, 82, 82, 328, 328, 328, 328), 
                 score = c(0.5, 0.39286, 0.5625, 0.5, 0.67647, 0.9375, 0.91667))

# Split data by unique IDs
df.sp <- split(df, df$id)

# Calculate rolling mean for each unique ID
df.ma <- lapply(seq_along(df.sp), function(i) {
  # fill = NA replaces the deprecated na.pad = TRUE
  rollmean(df.sp[[i]]$score, k = 3, fill = NA)
})

# Append column 'MA_score' to dataframe
for (i in seq_along(df.sp)) {
  df[which(df$id == names(df.sp)[i]), "MA_score"] <- df.ma[[i]]
}

df
   id   score  MA_score
1  82 0.50000        NA
2  82 0.39286 0.4851200
3  82 0.56250        NA
4 328 0.50000        NA
5 328 0.67647 0.7046567
6 328 0.93750 0.8435467
7 328 0.91667        NA
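The split, compute and append steps above can also be collapsed into one call with ave. The following is a minimal base R sketch, not part of the original answer: centered_ma is a hypothetical helper built on stats::filter, and the guard for groups shorter than the window is an added assumption about how such groups should be treated.

```r
df <- data.frame(id = c(82, 82, 82, 328, 328, 328, 328),
                 score = c(0.5, 0.39286, 0.5625, 0.5, 0.67647, 0.9375, 0.91667))

# Hypothetical helper: centered moving average of width k, with NA wherever
# the window does not fit (including groups with fewer than k rows)
centered_ma <- function(x, k = 3) {
  if (length(x) < k) return(rep(NA_real_, length(x)))
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}

# ave() applies the helper within each id and keeps the original row order
df$MA_score <- ave(df$score, df$id, FUN = centered_ma)
df
```

For the example data this should give the same MA_score column as the loop above, without building the intermediate list.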
