Explain "stat_summary = mean_cl_boot" in ggplot2?

Question

A perhaps simple question: I tried to make an error bar graph like the one shown on page 532 of Field's "Discovering Statistics Using R".

The code can be found here: http://www.sagepub.com/dsur/study/DSUR%20R%20Script%20Files/Chapter%2012%20DSUR%20GLM3.R

line <- ggplot(gogglesData, aes(alcohol, attractiveness, colour = gender))
line + stat_summary(fun.y = mean, geom = "point") + 
stat_summary(fun.y = mean, geom = "line", aes(group= gender)) + 
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2) + 
labs(x = "Alcohol Consumption", y = "Mean Attractiveness of Date (%)", colour = "Gender")

I produced the same graph. My y-axis variable has only 4 points (it is a discrete scale, 1-4), but the y-axis now shows the values 1.5, 2 and 2.5, and the lines vary between them.

And the question is: what do these points and lines describe? I assume that the important part is stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2). Are they counts of observations for that group and that level (x-axis)? Are they frequencies? Or are they proportions?

I found this http://docs.ggplot2.org/0.9.3/stat_summary.html but it did not help me.

Thank you

Answer 1

Here is what the ggplot2 book says about mean_cl_boot() on page 83:

Function         Hmisc original    Middle   Range
mean_cl_boot()   smean.cl.boot()   Mean     Standard error from bootstrap

I think it is smean.cl.boot() from the Hmisc package, renamed mean_cl_boot() in ggplot2.

And here is the definition of the original function from the Hmisc package:

smean.cl.boot is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality.
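To make that definition concrete, here is a minimal base R sketch of a percentile bootstrap for the mean. It is an illustration of the idea, not the Hmisc source: boot_mean_ci is a hypothetical helper, and the ratings vector is made-up data on a 1-4 scale like the one in the question. It shows why the plotted values are neither counts nor proportions: the point is the group mean, and the error bar spans the lower and upper quantiles of the resampled means, so fractional values such as 2.5 are expected even for integer data.

```r
# Percentile bootstrap confidence limits for a mean, in base R.
# boot_mean_ci is a hypothetical helper, not part of ggplot2 or Hmisc.
boot_mean_ci <- function(x, conf.int = 0.95, B = 1000) {
  x <- x[!is.na(x)]
  n <- length(x)
  # Resample n values with replacement, B times, keeping each resample's mean
  boot_means <- replicate(B, mean(x[sample.int(n, n, replace = TRUE)]))
  # The limits are the outer quantiles of those B bootstrap means
  limits <- quantile(boot_means,
                     c((1 - conf.int) / 2, (1 + conf.int) / 2),
                     names = FALSE)
  c(Mean = mean(x), Lower = limits[1], Upper = limits[2])
}

set.seed(1)                                         # reproducible resampling
ratings <- sample(1:4, size = 40, replace = TRUE)   # made-up 1-4 ratings
ci <- boot_mean_ci(ratings)
print(ci)
```

In the plot, each combination of x level and group gets its own interval computed this way from the observations in that cell.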

Answer 2

I reproduced the graph using your code and I get essentially the same graph shown in Field's book, Discovering Statistics Using R, figure 12.12, page 532, except for the ordering of the variables on the x axis. The y axis displays the continuous variable, Mean Attractiveness of Date (%). The 95% confidence intervals, created (as you point out) with the stat_summary() function and the mean_cl_boot argument, are bootstrap confidence intervals computed with the smean.cl.boot() function in Hmisc, as the other answer points out. This function is described on page 262 of the Hmisc documentation. The ggplot2 documentation on mean_cl_boot is sparse and defers to the description in the Hmisc package.

Note that the arguments to mean_cl_boot in ggplot2 are the same as those of the smean.cl.boot function in the Hmisc package. You can change the confidence level from the default of .95 with the conf.int argument, and the number of bootstrap samples with the B argument. Here, for example, is the code for creating the same plot with a 99% confidence interval and 5000 bootstrap samples:

line <- ggplot(gogglesData, aes(alcohol, attractiveness, colour = gender))
line + stat_summary(fun.y = mean, geom = "point") + 
stat_summary(fun.y = mean, geom = "line", aes(group= gender)) + 
stat_summary(fun.data = mean_cl_boot, conf.int = .99, B = 5000, geom = "errorbar", width = 0.2) + 
labs(x = "Alcohol Consumption", y = "Mean Attractiveness of Date (%)", colour = "Gender")
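The code above reflects the ggplot2 version current when this answer was written. In later releases the interface changed: since ggplot2 2.0.0, extra arguments for the summary function are passed through fun.args, and since 3.3.0 fun.y has been deprecated in favour of fun. The following is a sketch of the same plot under those assumptions. Because gogglesData is not reproduced here, it uses a made-up stand-in called toy, and it needs the Hmisc package installed for mean_cl_boot() to work.

```r
library(ggplot2)  # mean_cl_boot() also requires the Hmisc package to be installed

# Made-up stand-in for gogglesData: 3 alcohol levels x 2 genders, 20 ratings each
set.seed(42)
toy <- expand.grid(
  rep     = 1:20,
  alcohol = c("None", "2 Pints", "4 Pints"),
  gender  = c("Female", "Male")
)
toy$attractiveness <- round(runif(nrow(toy), min = 30, max = 90))

# Called directly on a vector, mean_cl_boot() returns a one-row data frame
# holding the mean (y) and the bootstrap limits (ymin, ymax)
ci <- mean_cl_boot(toy$attractiveness, conf.int = 0.99, B = 5000)
print(ci)

# Same three layers as above, with the newer argument names
p <- ggplot(toy, aes(alcohol, attractiveness, colour = gender)) +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun = mean, geom = "line", aes(group = gender)) +
  stat_summary(fun.data = mean_cl_boot,
               fun.args = list(conf.int = 0.99, B = 5000),
               geom = "errorbar", width = 0.2) +
  labs(x = "Alcohol Consumption", y = "Mean Attractiveness of Date (%)",
       colour = "Gender")
```

Calling print(p) draws the plot. The direct call to mean_cl_boot() is a quick way to check which numbers the error bars are built from.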

Answer 1: score 13, accepted, 2013-07-01 23:09:13

Answer 2: score 1, 2013-12-25 20:33:22
