How to plot, in the same graph, the histogram and the frequency polygon of two sets of data with ggplot2 in R
I have two sets of data and I would like to get a single graph with the histogram and frequency polygon for each set of data.
My data frame df is like this one:
'data.frame': 20000 obs. of 2 variables:
$ measure : num -0.566 0.321 0.125 1.353 -1.288 ...
$ processing: Factor w/ 2 levels "before","after": 1 1 1 1 1 1 1 1 1 1 ...
measure processing
1 -0.5656801 before
2 0.3210458 before
3 0.1252706 before
4 1.3532248 before
5 -1.2877305 before
6 0.3225545 before
My code is the following:
png("figure_%d.png")
set.seed(2014)
n <- 10000
before <- rnorm(n)
df_1 <- data.frame(measure=before)
df_1$processing <- factor("before")
after <- before-rnorm(n,mean=1,sd=0.1)
df_2 <- data.frame(measure=after)
df_2$processing <- factor("after")
df<-rbind(df_1,df_2)
library(ggplot2)
print(ggplot(df, aes(measure,colour=processing))+geom_freqpoly())
print(ggplot(df, aes(measure,fill=processing))+geom_density(alpha=0.5))
print(ggplot(df_1, aes(measure,fill=processing))+geom_histogram(alpha=0.5))
print(ggplot(df_2, aes(measure,fill=processing))+geom_histogram(alpha=0.5))
print(ggplot(df, aes(measure,fill=processing))+geom_histogram(alpha=0.5))
print(ggplot(df, aes(measure,fill=processing,colour=processing))+geom_freqpoly()+geom_histogram(alpha=0.5))
Now, after
ggplot(df, aes(measure,colour=processing))+geom_freqpoly()
I get the following figure
where the two frequency polygons are as expected.
After
ggplot(df, aes(measure,fill=processing))+geom_density(alpha=0.5)
I get the following figure
and where the two densities overlap I get the expected "blended" color.
Now I would like to get a figure with the two histograms. First, I draw the two histograms in two separate figures. With the code
ggplot(df_1, aes(measure,fill=processing))+geom_histogram(alpha=0.5)
I get the following figure
and with the code
ggplot(df_2, aes(measure,fill=processing))+geom_histogram(alpha=0.5)
I get the following figure
Both histograms are as expected.
The problem starts when I try to plot both histograms in the same graph. With this code
ggplot(df, aes(measure,fill=processing))+geom_histogram(alpha=0.5)
I get this figure
and I can't explain why the green histogram is higher than the red one. Furthermore, where the two histograms "overlap", I do not get a "blended" color.
Adding the frequency polygons makes the problem worse. With this code
ggplot(df, aes(measure,fill=processing,colour=processing))+geom_freqpoly()+geom_histogram(alpha=0.5)
I get this figure
where the frequency polygons seem correct to me, but the histograms are wrong, as in the previous figure.
What am I doing wrong?
The output from version is:
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 0.2
year 2013
month 09
day 25
svn rev 63987
language R
version.string R version 3.0.2 (2013-09-25)
nickname Frisbee Sailing
The output from sessionInfo() is:
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] methods stats graphics grDevices utils datasets base
other attached packages:
[1] ggplot2_0.9.3.1
loaded via a namespace (and not attached):
[1] colorspace_1.2-4 dichromat_2.0-0 digest_0.6.4 grid_3.0.2
[5] gtable_0.1.2 labeling_0.2 MASS_7.3-29 munsell_0.4.2
[9] plyr_1.8 proto_0.3-10 RColorBrewer_1.0-5 reshape2_1.2.2
[13] scales_0.2.3 stringr_0.6.2
Use geom_histogram with the argument position = "identity". The default value for position is "stack", so the bars of the two groups do not overlap but are stacked on top of each other. That is why one histogram appears higher than the other and why no "blended" color shows up.
geom_histogram(alpha = 0.5, position = "identity")
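To see the difference in numbers rather than by eye, the following sketch compares the bar extents that ggplot2 computes for the two position settings. It rebuilds the data from the question and assumes a reasonably recent ggplot2 (one that provides layer_data(), which the 0.9.3.1 version in the question does not); bins = 30 and the variable names stacked and unstacked are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Rebuild the data from the question (same seed and construction)
set.seed(2014)
n <- 10000
before <- rnorm(n)
after <- before - rnorm(n, mean = 1, sd = 0.1)
df <- rbind(
  data.frame(measure = before, processing = factor("before")),
  data.frame(measure = after,  processing = factor("after"))
)

p <- ggplot(df, aes(measure, fill = processing))

# layer_data() returns the computed bars; ymin/ymax are the drawn bar extents
stacked   <- layer_data(p + geom_histogram(bins = 30))  # default position = "stack"
unstacked <- layer_data(p + geom_histogram(bins = 30, position = "identity"))

# With "identity" every bar starts at zero; with "stack" some bars start
# on top of the other group's bar, which inflates the apparent height
all(unstacked$ymin == 0)
any(stacked$ymin > 0)
```

If the second check is TRUE, at least one bar is drawn above another group's bar, which is the stacking effect described in the question.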
The complete code:
library(ggplot2)
ggplot(df, aes(measure, fill = processing)) +
geom_histogram(alpha = 0.5, position = "identity")
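Since the question also asks for the frequency polygons in the same graph, the same fix can be combined with geom_freqpoly(). This is a minimal sketch rather than part of the original answer: it rebuilds the data from the question, and binwidth = 0.2 is an arbitrary illustrative value. geom_freqpoly() already uses position = "identity" by default, so only the histogram layer needs the argument.

```r
library(ggplot2)

# Rebuild the data from the question (same seed and construction)
set.seed(2014)
n <- 10000
before <- rnorm(n)
after <- before - rnorm(n, mean = 1, sd = 0.1)
df <- rbind(
  data.frame(measure = before, processing = factor("before")),
  data.frame(measure = after,  processing = factor("after"))
)

# binwidth = 0.2 is an assumed value; passing the same binwidth to both
# layers makes them bin the data identically
p <- ggplot(df, aes(measure, fill = processing, colour = processing)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  geom_freqpoly(binwidth = 0.2)
print(p)
```

The colour aesthetic also outlines the histogram bars. If that is not wanted, one option is to move colour = processing from the ggplot() call into aes() inside geom_freqpoly().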