I know how to compute the sd using summarize:
ans <- temp %>% group_by(permno) %>% summarise(std = sd(ret))
But how do I compute the standard deviation given I know the mean = 0?
In other words, I know the true mean and want to use that instead of using the sample mean while computing the sd.
One way would be to manually code the sd function, but I need it to work for each group, so I'm stuck.
It is always best to provide reproducible data. Here is an example with the iris data set:
library(dplyr)
data(iris)
GM <- mean(iris$Sepal.Length) # grand mean, treated here as the known "population" mean
ans <- iris %>% group_by(Species) %>% summarise(std = sqrt(sum((Sepal.Length - GM)^2) / length(Sepal.Length)))
ans
# A tibble: 3 × 2
# Species std
# <fct> <dbl>
# 1 setosa 0.907
# 2 versicolor 0.519
# 3 virginica 0.975
As compared with computing the sd with each group mean:
ans <- iris %>% group_by(Species) %>% summarise(std = sd(Sepal.Length))
ans
# A tibble: 3 × 2
# Species std
# <fct> <dbl>
# 1 setosa 0.352
# 2 versicolor 0.516
# 3 virginica 0.636
Note that sd() uses n - 1 in the denominator, but since you indicated that your mean is a known population mean, we divide by n instead.
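The spread around a fixed mean can never be smaller than the spread around the group's own mean: the mean squared deviation from the known value equals the n-denominator variance plus the squared gap between the group mean and that value. A small check of that identity, using the setosa rows of iris and the grand mean as the stand-in "known" mean (the variable names are just for illustration):

```r
# Setosa sepal lengths; iris ships with base R's datasets package
x  <- iris$Sepal.Length[iris$Species == "setosa"]
mu <- mean(iris$Sepal.Length)  # value treated as the known mean (an assumption for this demo)
n  <- length(x)

# Mean squared deviation around the known mean (n in the denominator)
var_known <- mean((x - mu)^2)

# var() divides by n - 1, so rescale it to the n-denominator version
var_group <- var(x) * (n - 1) / n

# Identity: var_known = var_group + (group mean - known mean)^2
identity_holds <- all.equal(var_known, var_group + (mean(x) - mu)^2)
identity_holds
```

So whenever a group mean sits far from the known mean, the known-mean sd will be noticeably larger than what sd() reports for that group.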
I came up with this solution:
sd_fn <- function(x, mean_pop) {
  # Root mean squared deviation from the known mean (divides by n, not n - 1)
  sqrt(sum((x - mean_pop)^2) / length(x))
}
x <- c(1,2,3,-1,-1.5,-2.8)
mean_pop <- 0
sd_fn(x, mean_pop)
I simply created a function whose arguments are a numeric vector and the population mean that you already know. Pass in the data vector and the population mean, and the function returns the desired standard deviation.
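To get this per group, which is what the question asks for, a function like this can be called inside summarise(). A minimal sketch, assuming dplyr is installed; the data frame is made-up stand-in data for the question's temp, and sd_known_mean is a hypothetical helper name rather than anything from dplyr or base R:

```r
library(dplyr)

# Hypothetical helper: sd around a known mean, dividing by n rather than n - 1
sd_known_mean <- function(x, mu = 0) {
  sqrt(mean((x - mu)^2))
}

# Made-up stand-in for the question's data (permno = group id, ret = returns)
temp <- tibble(
  permno = c(1, 1, 1, 2, 2, 2),
  ret    = c(0.01, -0.02, 0.03, 0.05, -0.04, 0.00)
)

# One row per permno, with the sd computed around the known mean of 0
ans <- temp %>%
  group_by(permno) %>%
  summarise(std = sd_known_mean(ret, mu = 0))
ans
```

Because summarise() evaluates the expression separately for each group, the helper only ever sees one group's ret values at a time, so no manual looping over groups is needed.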
Hi, if you want to calculate the sd around a true mean, I think you can do it by applying the mean function to the squared differences between the sample vector and the true mean, which gives the variance, and then using sqrt to get the standard deviation. Keep in mind that base R's var and sd functions apply Bessel's correction automatically; you can read more at https://www.r-bloggers.com/2018/11/how-to-de-bias-standard-deviation-estimates/
# sample size
n <- 1000
# random sample from a normal distribution with mean 0 and sd 3
universe <- rnorm(n, 0, 3)
# sample mean
p <- mean(universe)
p
# true mean
p0 <- 0
# calculate "manually" using the sample mean (n in the denominator)
variance <- mean((universe - p)^2)
variance
standard_deviation <- sqrt(variance)
standard_deviation
# calculate "manually" using the true mean
variance_true <- mean((universe - p0)^2)
variance_true
standard_deviation_true <- sqrt(variance_true)
standard_deviation_true
# calculate using built-in R functions
var_r <- var(universe)
var_r
r_sd <- sd(universe)
r_sd
# They apply Bessel's correction automatically, i.e. a factor of n/(n-1).
# Compare with all.equal(), since == on floating-point values can return FALSE:
all.equal(variance * n / (n - 1), var_r)
all.equal(sqrt(variance * n / (n - 1)), r_sd)