I am a beginner in R. I am trying to calculate the between-groups variance using the following code:
calcBetweenGroupsVariance <- function(variable,groupvariable)
{
  # find out how many values the group variable can take
  groupvariable2 <- as.factor(groupvariable[[1]])
  levels <- levels(groupvariable2)
  numlevels <- length(levels)
  # calculate the overall grand mean:
  grandmean <- mean(variable)
  # get the mean and standard deviation for each group:
  numtotal <- 0
  denomtotal <- 0
  for (i in 1:numlevels)
  {
    leveli <- levels[i]
    levelidata <- variable[groupvariable==leveli,]
    levelilength <- length(levelidata)
    # get the mean and standard deviation for group i:
    meani <- mean(levelidata)
    sdi <- sd(levelidata)
    numi <- levelilength * ((meani - grandmean)^2)
    denomi <- levelilength
    numtotal <- numtotal + numi
    denomtotal <- denomtotal + denomi
  }
  # calculate the between-groups variance
  Vb <- numtotal / (numlevels - 1)
  Vb <- Vb[[1]]
  return(Vb)
}
However, I get the following warning when I call this function:
calcBetweenGroupsVariance (data[3],data[2])
Warning message: In mean.default(variable) : argument is not numeric or logical: returning NA
I understand something is going wrong while using the mean function.
Here is the output of str(data)
'data.frame': 45 obs. of 11 variables:
$ V1 : int 2 3 3 2 3 2 2 2 3 2 ...
$ V2 : num 1.3243 -2.4546 0.1352 0.0676 -1.1901 ...
$ V3 : num 0.913 -2.644 0.663 1.217 -0.409 ...
$ V4 : num -1.863 1.965 -0.698 -0.945 0.617 ...
$ V5 : num -0.574 1.031 -0.308 -0.574 0.354 ...
$ V6 : num -0.8963 2.5702 0.0736 -1.3671 0.9045 ...
$ V7 : num 0.2276 0.0624 0.5945 0.6194 0.5473 ...
$ V8 : num 1.304 -1.624 0.408 0.368 -0.559 ...
$ V9 : num -0.1827 -0.9748 -0.5158 -0.0191 -0.3053 ...
$ V10: num -0.964 0.67 -0.12 0.789 0.711 ...
$ V11: num -0.833 -0.833 -0.833 -0.0539 -0.0539 ...
Kindly suggest how to get rid of this error.
Thanks and regards
There are multiple errors in your script, related to the dimensions of the objects and the difference between a vector and a list.
Let's assume the arguments variable and groupvariable
of your function should be vectors (1-d arrays).
The line groupvariable2 <- as.factor(groupvariable[[1]])
should be groupvariable2 <- as.factor(groupvariable)
because groupvariable
is a vector, not a list, and you are interested in all of its elements, not just the first.
The line levelidata <- variable[groupvariable==leveli,]
should be levelidata <- variable[groupvariable==leveli]
because variable
has only one dimension (it is not a matrix or data frame), so the subscript takes no comma.
The call to your function should be calcBetweenGroupsVariance(data[[3]], data[[2]])
(with double brackets [[ ]]) or alternatively calcBetweenGroupsVariance(data[, 3], data[, 2]).
Otherwise you pass a one-column data frame (which is a list) instead of a vector to the function, and mean() returns NA with exactly the warning you saw.
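Putting the three fixes together, the function might look roughly like the sketch below. It assumes both arguments are plain vectors of equal length. The data frame toy and its columns group and value are made up for illustration and only stand in for your real data; the unused sd and denominator bookkeeping is dropped for brevity.

```r
# Sketch of the corrected function; assumes `variable` and `groupvariable`
# are plain vectors of the same length (not one-column data frames).
calcBetweenGroupsVariance <- function(variable, groupvariable) {
  # find out how many values the group variable can take
  groupvariable2 <- as.factor(groupvariable)
  grouplevels <- levels(groupvariable2)
  numlevels <- length(grouplevels)
  # overall grand mean
  grandmean <- mean(variable)
  numtotal <- 0
  for (leveli in grouplevels) {
    # values belonging to this group (vector subsetting, so no comma)
    levelidata <- variable[groupvariable2 == leveli]
    # group size times squared distance of the group mean from the grand mean
    numtotal <- numtotal + length(levelidata) * (mean(levelidata) - grandmean)^2
  }
  # between-groups variance: weighted sum of squares over (groups - 1)
  numtotal / (numlevels - 1)
}

# Hypothetical toy data, not the data from the question
toy <- data.frame(group = c(1, 1, 2, 2, 3, 3),
                  value = c(1, 2, 3, 4, 5, 6))

class(toy[2])    # single brackets keep a one-column data frame
class(toy[[2]])  # double brackets extract the numeric vector

result <- calcBetweenGroupsVariance(toy[[2]], toy[[1]])
print(result)
```

The two class() calls show the difference that matters here: single brackets return a data frame, which mean() cannot average, while double brackets return the underlying numeric vector.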