
Generating Statistics Summary from a ggplot in R

I'm an R novice working on a project with a script provided by my professor, and I'm having trouble getting a mean for my data that matches the box plot I created. The mean in this plot is below 300 kg per stem, but when I use

ggsummarystats( DBHdata, x = "location", y = "biomassKeith_and_Camphor", ggfunc = ggboxplot, add = "jitter" )

or

tapply(DBHdata$biomassBrown_and_Camphor, DBHdata$location, mean)

I end up with means over 600 kg/stem. Is there a way to produce summary statistics in the code for my box plot?

Box and Whisker plot of kg per stem

Boxplots do not show mean values; the line inside each box is the median. This could explain the discrepancy you are observing in your calculations.

Additionally, the data appear to be strongly skewed towards large values, so a mean of over 600 despite medians of ca. 200 is not surprising.
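To illustrate how far a few large values can pull the mean away from the median, here is a minimal sketch. The numbers and the `toy` data frame are made up for illustration and are not the asker's data; only base R functions are used.

```r
# Hypothetical biomass values (kg per stem): four ordinary stems, one very large one
biomass <- c(100, 150, 200, 250, 2500)

median(biomass)  # 200: the line a boxplot would draw
mean(biomass)    # 640: pulled upwards by the single large stem

# The same comparison per group, using tapply() as in the question
toy <- data.frame(
  location = rep(c("A", "B"), each = 5),   # hypothetical site labels
  biomass  = c(biomass, biomass / 2)
)
group_medians <- tapply(toy$biomass, toy$location, median)
group_means   <- tapply(toy$biomass, toy$location, mean)
print(rbind(median = group_medians, mean = group_means))
```

Comparing the two rows for your own data should show whether skew alone accounts for the gap between the plot and the calculated means.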

As others have pointed out, a boxplot shows the median by default. If you want to get the mean with ggsummarystats (which, like ggboxplot, comes from the ggpubr package), you can change the functions it calls with the summaries argument, like this:

library(ggpubr)

ggsummarystats(DBHdata, x = "location", y = "biomassKeith_and_Camphor",
               ggfunc = ggboxplot, add = "jitter",
               summaries = c("n", "median", "iqr", "mean"))

This adds the mean alongside the default output of n, median, and interquartile range (iqr).
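If you want to cross-check the numbers in that table without any plotting package, the same four statistics can be computed per group in base R. This is only a sketch: it uses the built-in PlantGrowth data as a stand-in, so `weight` and `group` would need to be replaced by your own biomass and location columns.

```r
# Stand-in data: PlantGrowth ships with R (datasets package)
data(PlantGrowth)

# Split the response by group and compute n, median, IQR and mean for each piece
stats_by_group <- do.call(
  rbind,
  lapply(split(PlantGrowth$weight, PlantGrowth$group), function(w) {
    data.frame(n = length(w), median = median(w), iqr = IQR(w), mean = mean(w))
  })
)
print(stats_by_group)
```

Each row corresponds to one group, so the median column should agree with the lines drawn in the boxes and the mean column with the output of tapply().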

I'm not sure I understand your question correctly, but you could first calculate the group means with aggregate and then add them to the plot as text labels.

Sample code:

library(ggplot2)

# Group means, used for the text labels
means <- aggregate(weight ~ group, PlantGrowth, mean)

ggplot(PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  # Mark each group mean with a dark red diamond
  stat_summary(fun = mean, colour = "darkred", geom = "point",
               shape = 18, size = 3, show.legend = FALSE) +
  # Print the mean value just above the marker
  geom_text(data = means, aes(label = weight, y = weight + 0.08))

Plot: [image not available]

Sample data:

data(PlantGrowth)
