How to calculate deviations from weighted mean in data.table?

I would like to calculate deviations from the (weighted) mean for many variables in a data.table.

Let's take this example set:

library(data.table)

mydt <- data.table(
    id = c(1, 2, 2, 3, 3, 3),
    x = 1:6,
    y = 6:1,
    w = rep(1:2, 3)
)

mydt
   id x y w
1:  1 1 6 1
2:  2 2 5 2
3:  2 3 4 1
4:  3 4 3 2
5:  3 5 2 1
6:  3 6 1 2

I can calculate the weighted means of x and y as follows:

mydt[
    ,
    lapply(
        as.list(.SD)[c("x", "y")], 
        weighted.mean, w = w
    ),
    by = id
]

(I use the relatively complicated as.list(.SD)[...] construct instead of .SDcols because of this bug.)

I tried to first create the means for each row, but could not find a way to combine := with lapply().

Just tweak the weighted mean calculation a bit:

mydt[
    ,
    lapply(
        .SD[, .(x, y)], 
        function(var) var - weighted.mean(var, w = w)
    ),
    by = id
]

   id       x       y
1:  1  0.0000  0.0000
2:  2 -0.3333  0.3333
3:  2  0.6667 -0.6667
4:  3 -1.0000  1.0000
5:  3  0.0000  0.0000
6:  3  1.0000 -1.0000
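
As a quick sanity check on the output above, the values for id 2 can be reproduced by hand. This is only a sketch using base R: it copies the x and w values of rows 2 and 3 from the example table and applies the definition of a weighted mean directly.

```r
# Group id == 2 from the example table: x = (2, 3) with weights w = (2, 1)
x <- c(2, 3)
w <- c(2, 1)

wm  <- sum(w * x) / sum(w)   # weighted mean by definition: (2*2 + 1*3) / 3
dev <- x - wm                # deviations from the weighted mean

# The hand computation agrees with stats::weighted.mean()
stopifnot(isTRUE(all.equal(wm, weighted.mean(x, w))))

round(dev, 4)
```

The rounded deviations are -0.3333 and 0.6667, matching the x column of rows 2 and 3 in the result.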

The solution was updated with the notational simplification suggested by @DavidArenburg.
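
The question also mentions wanting to combine := with lapply(). One possible way to do that is sketched below. It assumes a reasonably recent data.table version in which columns outside .SDcols (here w) remain visible in j, so it may not apply to the version affected by the bug mentioned in the question. The names vars and dev_names, and the "_dev" suffix, are made up for illustration.

```r
library(data.table)

mydt <- data.table(
    id = c(1, 2, 2, 3, 3, 3),
    x = 1:6,
    y = 6:1,
    w = rep(1:2, 3)
)

vars <- c("x", "y")                  # columns to center (illustrative name)
dev_names <- paste0(vars, "_dev")    # hypothetical names for the new columns

# Add the deviations by reference, one new column per entry in vars.
# Wrapping dev_names in parentheses makes := use the character vector
# as the column names instead of creating a column called "dev_names".
mydt[
    ,
    (dev_names) := lapply(.SD, function(v) v - weighted.mean(v, w = w)),
    by = id,
    .SDcols = vars
]

mydt
```

Unlike the lapply() call without :=, this keeps the original x, y and w columns and appends x_dev and y_dev to mydt in place.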
