Drawing from truncated normal distribution delivers wrong standard deviation in R

I am drawing random numbers from a truncated normal distribution. After truncation at 0 from the left, the distribution is supposed to have mean 100 and standard deviation 60. I wrote an algorithm to compute the mean and sd of the normal distribution prior to truncation (mean_old and sd_old). The function vtruncnorm gives me the wanted variance of 60^2. However, when I draw random variables from the distribution, the standard deviation is around 96. I don't understand why the sd of the random draws differs from the computed value of 60.

I tried increasing the number of draws; the sd still comes out around 96.

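 require(truncnorm)
 # parameters of the untruncated normal, intended to give mean 100 and sd 60 after truncation at 0
 mean_old = -5425.078
 sd_old = 745.7254
 val = rtruncnorm(10000, a=0,  mean = mean_old, sd = sd_old)
 sd(val)                                                  # empirical sd of the draws (around 96)
 sqrt(vtruncnorm( a=0,  mean = mean_old, sd = sd_old))    # theoretical sd (60)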

OK, I ran a quick test:

require(truncnorm)

val = rtruncnorm(1000000, a=7.2,  mean = 0.0, sd = 1.0)
sd(val)
sqrt(vtruncnorm( a=7.2,  mean = 0.0, sd = 1.0))

This is the canonical truncated Gaussian (mean 0, sd 1). At a = 6 the two values are very close, for example 0.1554233 vs. 0.1548865, depending on the seed and so on. At a = 7 they differ systematically: 0.1358143 vs. 0.1428084 (the sampled value is smaller than the value from the function call). I also checked with a Python implementation:

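import numpy as np
from scipy.stats import truncnorm

# a, b are the truncation bounds in standard units (default loc=0, scale=1)
a, b = 7.0, 100.0

mean, var, skew, kurt = truncnorm.stats(a, b, moments='mvsk')

# theoretical sd
print(np.sqrt(var))

# empirical sd of the sampled values
r = truncnorm.rvs(a, b, size=100000)
print(np.sqrt(np.var(r)))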

and got back 0.1428083662823426, which is consistent with the R vtruncnorm result. At your truncation point, a = 7.2 or so, the results are even worse.
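The a = 7.2 figure is the question's lower bound of 0 expressed in standard units. A minimal sketch of that conversion, using the question's mean_old and sd_old (the names lower and a_std are only illustrative):

```python
# Question's parameters for the untruncated normal
mean_old = -5425.078
sd_old = 745.7254
lower = 0.0  # truncation point on the original scale

# Lower bound in standard units: how many sds the cut lies above the mean
a_std = (lower - mean_old) / sd_old
print(round(a_std, 2))  # 7.27
```

So the question samples more than seven standard deviations out in the upper tail, which is the regime where the quick test above starts to disagree.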

Moral of the story: sampling from rtruncnorm appears to have a bug at large values of a. The Python implementation has the same problem.
