
Issue overlapping ggplot2 histograms with different variables

I have a data frame (snapshot below; it actually has over 12,000 observations) for which I've been able to use ggplot to construct a histogram depicting the distribution of each variable on its own. I am now interested in superimposing the 3 histograms I've constructed on top of one another to create a 'single' histogram.

[Image: snapshot of the data frame]

I've tried to adapt the suggestion below from a similar question that was asked previously:

d = data.frame(x = c(data1, data2), 
               type=rep(c("A", "B"), c(length(data1), length(data2))))
ggplot(d) + 
  geom_density(aes(x=x, colour=type))

but I keep getting the following error:

#combinedImputedDataVis is a data frame
distr <- combinedImputedDataVis[,37:39]
ggplot(distr) + geom_density(aes(x=c(rt,controlt,cleart),
                                 type=rep(c("rt","controlt","cleart"),c(length(rt),length(controlt),length(cleart)))))

Error: Aesthetics must be either length 1 or the same as the data (12687): x and type
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Ignoring unknown aesthetics: type 

I'm not sure where I am going wrong and would appreciate a second set of eyes.

Using your original data frame as shown in the image in your question:

d <- structure(list(rt = c(15, 173, 66, 167, 341), controlt = c(1294, 
181, 145, 835, 675), cleart = c(1603, 3274, 722, 1059, 2468)), 
class = "data.frame", row.names = c(NA, -5L))

d
#>    rt controlt cleart
#> 1  15     1294   1603
#> 2 173      181   3274
#> 3  66      145    722
#> 4 167      835   1059
#> 5 341      675   2468

If the data is in this format, it would be easiest to add a separate geom for each column:

ggplot(d) + 
  geom_density(aes(x = rt), fill = "red", alpha = 0.5) + 
  geom_density(aes(x = controlt), fill = "green", alpha = 0.5) + 
  geom_density(aes(x = cleart), fill = "blue", alpha = 0.5)
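Since the question asks about histograms rather than density curves, the same one-layer-per-column idea can be sketched with geom_histogram. This is only an illustration on the five-row toy data from above; the choice of bins = 30 is an assumption and would need tuning for the real 12,000-row data.

```r
library(ggplot2)

# Toy data copied from the answer above
d <- data.frame(
  rt       = c(15, 173, 66, 167, 341),
  controlt = c(1294, 181, 145, 835, 675),
  cleart   = c(1603, 3274, 722, 1059, 2468)
)

# One histogram layer per column; alpha keeps the overlapping bars visible.
# bins = 30 is an arbitrary choice for this sketch.
p <- ggplot(d) +
  geom_histogram(aes(x = rt), fill = "red", alpha = 0.5, bins = 30) +
  geom_histogram(aes(x = controlt), fill = "green", alpha = 0.5, bins = 30) +
  geom_histogram(aes(x = cleart), fill = "blue", alpha = 0.5, bins = 30)
p
```

Because the fills are set as constants outside aes(), this version draws no legend, and the x axis takes its label from the first layer (rt).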


But it would be better to reshape your data so that all the variable names are in one column and all their values are in another. That makes it much easier to control various aspects of the plot:

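library(dplyr)    # for %>%
library(tidyr)    # for pivot_longer()
library(ggplot2)

# Reshape to long format: column names go to `name`, their values to `value`
d %>% 
  pivot_longer(everything()) %>%
  ggplot(aes(x = value, fill = name)) + 
  geom_density(alpha = 0.5)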
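As for why the attempt in the question failed: ggplot2 expects each aesthetic to have length 1 or one value per row of the data, but c(rt, controlt, cleart) is three times as long as the data frame, and type is not a ggplot2 aesthetic at all (hence the warning). In the suggestion being adapted, type was a column of the data frame that was mapped to colour. A minimal base-R sketch of that pattern, using the toy data from above (the name long is hypothetical and used only here), might look like this:

```r
library(ggplot2)

# Toy data copied from the answer above
d <- data.frame(
  rt       = c(15, 173, 66, 167, 341),
  controlt = c(1294, 181, 145, 835, 675),
  cleart   = c(1603, 3274, 722, 1059, 2468)
)

# Stack the three columns into one value column plus a label column,
# so that every aesthetic has exactly one value per row of `long`.
long <- data.frame(
  x    = c(d$rt, d$controlt, d$cleart),
  type = rep(c("rt", "controlt", "cleart"), each = nrow(d))
)

# `type` is now an ordinary column, mapped to the real aesthetic `colour`
p <- ggplot(long, aes(x = x, colour = type)) +
  geom_density()
p
```

This builds the same kind of long table that pivot_longer() produces, just with different column names.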

