
R weighted mean with Gaussian distribution

I have limited stats education, so I may be trying to define something that already exists as a simple function, which could be why I cannot find an existing answer.

The problem is to calculate a weighted mean for a time series, giving greater weight to the most recent data. The weighting should follow "one side of a Gaussian curve" (an "S" curve?), with the highest value at the most recent (last) point. I realise there would be a couple of coefficients to define the gradient of the curve, but assume "normal".

Weighting Points along an 'S' curve in R

This seems to be asking the same question, but the only answer is a bit over-engineered for what I am looking for.

I can generate a linear weighted average as follows:

# time series data
d <- c(7, 8, 10, 7, 8, 11, 9, 6, 13, 10, 11, 11)
# weight coefficients
w <- seq(1, length(d), 1)
w <- w / sum(w)
w
[1] 0.01282051 0.02564103 0.03846154 0.05128205 0.06410256 0.07692308 0.08974359 0.10256410
[9] 0.11538462 0.12820513 0.14102564 0.15384615
weighted.mean(d, w, na.rm = TRUE)
[1] 9.846154

How do I use a "Gaussian sequence" for w instead of my linear one?

You can get weights that follow the left-hand side of a normal distribution like this:

w <- dnorm(seq(-3, 0, length = length(d)))
w <- w / sum(w)

So the weights look something like this:

plot(w)


If you want the curve to be steeper, set the sd argument of dnorm to less than 1; if you want it more gradual, increase its value. The example above uses the default, sd = 1.
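
To make the effect of sd concrete, here is a minimal sketch that builds the weights for a few values of sd and applies them to the series d from the question. The helper name gauss_weights and the sd values 0.5, 1 and 2 are illustrative assumptions, not part of the answer above.

```r
d <- c(7, 8, 10, 7, 8, 11, 9, 6, 13, 10, 11, 11)

# Hypothetical helper: left half of a normal density on [-3, 0],
# normalised so the weights sum to 1
gauss_weights <- function(n, sd = 1) {
  w <- dnorm(seq(-3, 0, length.out = n), sd = sd)
  w / sum(w)
}

# Print the weight given to the newest point and the resulting weighted mean
for (s in c(0.5, 1, 2)) {
  w <- gauss_weights(length(d), sd = s)
  cat("sd =", s,
      " weight on last point =", round(w[length(w)], 3),
      " weighted mean =", round(weighted.mean(d, w), 3), "\n")
}
```

A smaller sd concentrates the weight on the last few observations, so the weighted mean tracks the most recent values more closely.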


EDIT

An alternative that might allow for better control would be a logistic curve:

w <- plogis(seq(-1, 1, length = length(d)), scale = 0.3)
w <- w / sum(w)
plot(w)


w <- plogis(seq(-1, 1, length = length(d)), scale = 0.15)
w <- w / sum(w)
plot(w)
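
As a rough way to compare the two scale settings without the plots, the sketch below applies both sets of logistic weights to d and looks at how much more the newest point counts than the oldest. The helper name logis_weights is a hypothetical wrapper around the same plogis call used above.

```r
d <- c(7, 8, 10, 7, 8, 11, 9, 6, 13, 10, 11, 11)

# Hypothetical helper: logistic ("S"-shaped) weights on [-1, 1],
# normalised so the weights sum to 1
logis_weights <- function(n, scale = 0.3) {
  w <- plogis(seq(-1, 1, length.out = n), scale = scale)
  w / sum(w)
}

w_soft  <- logis_weights(length(d), scale = 0.3)
w_sharp <- logis_weights(length(d), scale = 0.15)

# Ratio of newest to oldest weight; a larger ratio means old data count for less
w_soft[length(d)] / w_soft[1]
w_sharp[length(d)] / w_sharp[1]

# Weighted means under each weighting
weighted.mean(d, w_soft)
weighted.mean(d, w_sharp)
```

Because the grid is symmetric around 0, the newest-to-oldest ratio works out to exp(1 / scale), so halving scale makes the transition between "old" and "recent" much sharper.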


To run a weighted mean along a time series, I would recommend using convolve for efficiency reasons, rather than trying to reimplement it. For instance:

d <- c(7, 8, 10, 7, 8, 11, 9, 6, 13, 10, 11, 11)

# 5-point Gaussian kernel; sd = 0.5 gives the weights behind the output below
k <- dnorm(seq(-2, 2, length.out = 5), sd = 0.5)

convolve(d, k/sum(k), type = "filter")


[1]  9.466427  7.427122  8.213693 10.465371  8.894341  7.066883 11.933909
[8] 10.425011
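
To see what type = "filter" computes, the following sketch writes the running weighted mean out window by window and checks that it agrees with convolve. It then shows a one-sided variant, where the weights rise towards the newest point in each window, which is closer to what the question asks for. The 5-point window, the kernel ranges and the variable names are assumptions chosen for illustration.

```r
d <- c(7, 8, 10, 7, 8, 11, 9, 6, 13, 10, 11, 11)

# Symmetric 5-point Gaussian kernel, normalised to sum to 1
k <- dnorm(seq(-2, 2, length.out = 5))
k <- k / sum(k)

# Running weighted mean written out by hand, one window at a time
n_win <- length(d) - length(k) + 1
by_hand <- sapply(seq_len(n_win), function(i) {
  weighted.mean(d[i:(i + length(k) - 1)], k)
})

# convolve() should agree up to floating-point error
via_convolve <- convolve(d, k, type = "filter")
all.equal(by_hand, via_convolve)

# One-sided variant: left half of a Gaussian, heaviest on the newest point
k_trail <- dnorm(seq(-3, 0, length.out = 5))
k_trail <- k_trail / sum(k_trail)
trailing <- sapply(seq_len(n_win), function(i) {
  weighted.mean(d[i:(i + length(k_trail) - 1)], k_trail)
})
trailing
```

The one-sided version is computed by hand rather than with convolve so that the orientation of an asymmetric kernel is explicit in the code.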
