R: trying to calculate means and sd + warning that object cannot be found while the object is listed in the data frame header

I'm fairly new to R and I am struggling to calculate the mean of a single column. RStudio returns the same message for several different approaches (described further down). I have searched the existing questions, but either they did not cover what I was looking for, or the solution did not work with my data.

My data has different studies as rows and study quality ratings with multiple sub-points as columns. A simplified version looks like this:

> dd <- data.frame(authoryear = c("Smith, 2020", "Meyer, 2019", "Lim, 2019", "Lowe, 2018"),
+                  stqu1 = c(1, 3, 2, 4), 
+                  stqu2 = c(8, 3, 9, 9),
+                  stqu3 = c(1, 1, 1, 2))
> dd
   authoryear stqu1 stqu2 stqu3
1 Smith, 2020     1     8     1
2 Meyer, 2019     3     3     1
3   Lim, 2019     2     9     1
4  Lowe, 2018     4     9     2

I calculated the sums of the study quality ratings for each study by rowSums and created a new column in my data frame called "stqu_sum". Like so:

dd$stqu_sum <- rowSums(subset(dd, select = c(stqu1, stqu2, stqu3)), na.rm = TRUE)

Now I would like to calculate the mean and standard deviation of stqu_sum over all the studies (rows). I googled and found many different ways to do this, but no matter what I try, I get the same warning which I don't know how to fix.

Things I have tried:

#defining stqu_sum as numeric
dd[, stqu_sum := as.numeric(stqu_sum)]

#colMeans
colMeans(dd, select = stqu_sum, na.rm = TRUE)
#sapply
sapply(dd, function(dd) c( "Stand dev" = sd(stqu_sum), 
                                           "Mean"= mean(stqu_sum,na.rm=TRUE),
                                           "n" = length(stqu_sum),
                                           "Median" = median(stqu_sum),
))

#data.table
dd[, .(mean_stqu = mean("stqu_sum"), sd_stqu = sd("stqu_sum")),.(variable, value)]

All of these returned the same message: object 'stqu_sum' not found. However, the stqu_sum column is shown in the header of my data frame.

Can anyone help me to fix this or show me another way to do this? I hope this is detailed enough. Please let me know if I should add any information. Thank you in advance!

Is this what you're after? Mean and SD for stqu_sum:

library(dplyr)  # provides %>% and summarise()

dd_summary <- dd %>%
  summarise(mean = mean(stqu_sum),
            SD = sd(stqu_sum))

Gives:

> dd_summary
  mean       SD
1   11 3.366502
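
For comparison, here is a minimal base R sketch of the same calculation, with no extra packages. It assumes the example data from the question. The likely cause of the "object not found" message is that a bare stqu_sum is looked up in the workspace rather than inside dd, so the column has to be reached through the data frame. The names mean_stqu and sd_stqu are only illustrative.

```r
# Rebuild the example data from the question
dd <- data.frame(authoryear = c("Smith, 2020", "Meyer, 2019", "Lim, 2019", "Lowe, 2018"),
                 stqu1 = c(1, 3, 2, 4),
                 stqu2 = c(8, 3, 9, 9),
                 stqu3 = c(1, 1, 1, 2))
dd$stqu_sum <- rowSums(dd[, c("stqu1", "stqu2", "stqu3")], na.rm = TRUE)

# Reach the column through the data frame with `$`
mean_stqu <- mean(dd$stqu_sum, na.rm = TRUE)
sd_stqu   <- sd(dd$stqu_sum, na.rm = TRUE)

# with() evaluates the expression using dd's columns as variables
with(dd, c(mean = mean(stqu_sum), sd = sd(stqu_sum)))
```

Both forms should agree with the dplyr result above.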

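With data.table, we don't need to quote the column names. Note that dd has to be converted to a data.table first, and that the grouping by variable and value is dropped here because those columns do not exist in the example data:

library(data.table)
setDT(dd)  # convert the data frame to a data.table by reference
dd[, .(mean_stqu = mean(stqu_sum), sd_stqu = sd(stqu_sum))]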
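
If the goal of the sapply attempt in the question was a single named vector of summary statistics, one possible base R sketch is shown here. It assumes the same example data; x and stqu_stats are hypothetical names chosen for illustration. sapply is not needed because only one column is summarised.

```r
# Example data from the question (only the rating columns are needed here)
dd <- data.frame(stqu1 = c(1, 3, 2, 4),
                 stqu2 = c(8, 3, 9, 9),
                 stqu3 = c(1, 1, 1, 2))
dd$stqu_sum <- rowSums(dd[, c("stqu1", "stqu2", "stqu3")], na.rm = TRUE)

# Pull the column out once, then compute each statistic on that vector
x <- dd$stqu_sum
stqu_stats <- c("Stand dev" = sd(x, na.rm = TRUE),
                "Mean"      = mean(x, na.rm = TRUE),
                "n"         = sum(!is.na(x)),   # number of non-missing values
                "Median"    = median(x, na.rm = TRUE))
stqu_stats
```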
