How to overlay density plots in R?

I would like to overlay two density plots on the same device in R. How can I do that? I searched the web but didn't find any obvious solution.

My idea would be to read data from a text file (columns) and then use

plot(density(MyData$Column1))
plot(density(MyData$Column2), add=T)

Or something in this spirit.

Use lines() for the second one:

plot(density(MyData$Column1))
lines(density(MyData$Column2))

Make sure the limits of the first plot are suitable, though.
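
One way to get suitable limits is to compute both densities first and take the range over both curves. The sketch below assumes simulated data standing in for MyData$Column1 and MyData$Column2; the column names come from the question, the values are made up.

```r
# Simulated stand-in for the data read from the text file (assumption)
set.seed(1)
MyData <- data.frame(Column1 = rnorm(200),
                     Column2 = rnorm(200, mean = 4, sd = 0.5))

# Compute both densities before plotting anything
d1 <- density(MyData$Column1)
d2 <- density(MyData$Column2)

# Axis limits that cover both curves
xlim <- range(d1$x, d2$x)
ylim <- range(d1$y, d2$y)

# The first call sets up the axes; lines() adds the second curve
plot(d1, xlim = xlim, ylim = ylim, main = "Two densities")
lines(d2, col = "red")
```

Without the explicit xlim and ylim, the axes are sized for the first curve only, so the second one may be clipped.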

ggplot2 is another graphics package that handles things like the range issue Gavin mentions in a pretty slick way. It also generates appropriate legends automatically and, in my opinion, generally has a more polished feel out of the box, with less manual manipulation.

library(ggplot2)

#Sample data
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
                   , lines = rep(c("a", "b"), each = 100))
#Plot.
ggplot(dat, aes(x = dens, fill = lines)) + geom_density(alpha = 0.5)
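
If the data are in two separate columns, as in the question, one possible variation is to add one geom_density layer per column instead of reshaping to long format first. This is only a sketch: MyData is simulated here, and the labels "value" and "column" are arbitrary choices.

```r
library(ggplot2)

# Simulated stand-in for the two columns from the question (assumption)
set.seed(1)
MyData <- data.frame(Column1 = rnorm(100),
                     Column2 = rnorm(100, 10, 5))

# One layer per column; the constant fill value gives each layer a legend entry
p <- ggplot(MyData) +
  geom_density(aes(x = Column1, fill = "Column1"), alpha = 0.5) +
  geom_density(aes(x = Column2, fill = "Column2"), alpha = 0.5) +
  labs(x = "value", fill = "column")
print(p)
```

This stays readable for two or three columns; for many columns the long format used above scales better.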


Adding a base graphics version that takes care of the y-axis limits, adds colors, and works for any number of columns:

If we have a data set:

myData <- data.frame(std.normal=rnorm(1000, mean=0, sd=1),
                     wide.normal=rnorm(1000, mean=0, sd=2),
                     exponent=rexp(1000, rate=1),
                     uniform=runif(1000, min=-3, max=3)
                     )

Then to plot the densities:

dens <- apply(myData, 2, density)

plot(NA, xlim=range(sapply(dens, "[", "x")), ylim=range(sapply(dens, "[", "y")))
mapply(lines, dens, col=1:length(dens))

legend("topright", legend=names(dens), fill=1:length(dens))

Which gives one colored density curve per column, with a legend in the top-right corner.

Just to provide a complete set, here's a version of Chase's answer using lattice:

library(lattice)

dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
                   , lines = rep(c("a", "b"), each = 100))

densityplot(~dens,data=dat,groups = lines,
            plot.points = FALSE, ref = TRUE,
            auto.key = list(space = "right"))

which produces the two overlaid density curves, with a key to the right of the plot.

That's how I do it in base (it's actually mentioned in the comments on the first answer, but I'll show the full code here, including the legend, as I cannot comment yet...)

First you need to get the maximum values for the y axis from the density plots, so you need to compute the densities separately first:

dta_A <- density(VarA, na.rm = TRUE)
dta_B <- density(VarB, na.rm = TRUE)

Then plot them as in the first answer, using the densities you just computed to define the min and max values for the y axis. (I set the min value to 0.)

plot(dta_A, col = "blue", main = "2 densities on one plot",
     ylim = c(0, max(dta_A$y, dta_B$y)))
lines(dta_B, col = "red")

Then add a legend to the top right corner:

legend("topright", c("VarA","VarB"), lty = c(1,1), col = c("blue","red"))

I took the above lattice example and made a nifty function. There is probably a better way to do this with reshape via melt/cast. (Comment or edit if you see an improvement.)

library(lattice)

multi.density.plot <- function(data, main = paste(names(data), collapse = " vs "), ...) {
  ## Combines multiple density plots when given a named list of numeric vectors
  df <- data.frame()
  for (n in names(data)) {
    # One long-format chunk per list element, labelled with its name
    idf <- data.frame(x = data[[n]], label = rep(n, length(data[[n]])))
    df <- rbind(df, idf)
  }
  densityplot(~x, data = df, groups = label, plot.points = FALSE, ref = TRUE,
              auto.key = list(space = "right"), main = main, ...)
}

Example usage:

multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1),main='BN1 vs BN2')

multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1))
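
On the remark about a better way to reshape: base R's stack() turns a named list of vectors into a long data frame in one call, which may replace the rbind loop. A rough sketch, assuming a made-up list named samples in place of bn1$V1 and bn2$V1:

```r
library(lattice)

# Hypothetical input: a named list of numeric vectors of unequal length
set.seed(42)
samples <- list(BN1 = rnorm(100), BN2 = rnorm(150, mean = 3))

# stack() returns a data frame with columns 'values' and 'ind' (the list names)
long <- stack(samples)

# Same lattice call as in the function above, on the stacked data
p <- densityplot(~values, data = long, groups = ind,
                 plot.points = FALSE, ref = TRUE,
                 auto.key = list(space = "right"),
                 main = paste(names(samples), collapse = " vs "))
print(p)
```

This also avoids growing a data frame inside a loop, which gets slow for long lists.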

Whenever there are issues of mismatched axis limits, the right tool in base graphics is matplot. The key is to leverage the from and to arguments to density.default. It's a bit hackish, but fairly straightforward to roll yourself:

set.seed(102349)
x1 = rnorm(1000, mean = 5, sd = 3)
x2 = rnorm(5000, mean = 2, sd = 8)

xrng = range(x1, x2)

#force the x values at which density is
#  evaluated to be the same between 'density'
#  calls by specifying 'from' and 'to'
#  (and possibly 'n', if you'd like)
kde1 = density(x1, from = xrng[1L], to = xrng[2L])
kde2 = density(x2, from = xrng[1L], to = xrng[2L])

matplot(kde1$x, cbind(kde1$y, kde2$y))

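[Figure: output of the matplot call. Two curves are shown, one red and one black; the black curve reaches higher than the red one, while the red curve is "fatter".]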

Add bells and whistles as desired (matplot accepts all the standard plot/par arguments, e.g. lty, type, col, lwd, ...).
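
As an illustration of such options, the sketch below repeats the setup above and passes type = "l" so that matplot draws lines, then adds a legend. The colors, line width, and legend labels are arbitrary choices, not part of the original answer.

```r
set.seed(102349)
x1 <- rnorm(1000, mean = 5, sd = 3)
x2 <- rnorm(5000, mean = 2, sd = 8)
xrng <- range(x1, x2)

# Same evaluation grid for both densities, via 'from' and 'to'
kde1 <- density(x1, from = xrng[1L], to = xrng[2L])
kde2 <- density(x2, from = xrng[1L], to = xrng[2L])

# Draw both curves as solid lines and label them
matplot(kde1$x, cbind(kde1$y, kde2$y), type = "l", lty = 1, lwd = 2,
        col = c("black", "red"), xlab = "x", ylab = "Density")
legend("topright", legend = c("x1", "x2"), lty = 1, lwd = 2,
       col = c("black", "red"))
```

Because both densities share the same x grid, a single matplot call can size the axes for both curves at once.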

You can use the ggjoy package. Let's say that we have three different beta distributions such as:

set.seed(5)
b1<-data.frame(Variant= "Variant 1", Values = rbeta(1000, 101, 1001))
b2<-data.frame(Variant= "Variant 2", Values = rbeta(1000, 111, 1011))
b3<-data.frame(Variant= "Variant 3", Values = rbeta(1000, 11, 101))


df<-rbind(b1,b2,b3)

You can get the three different distributions as follows:

library(tidyverse)
library(ggjoy)


ggplot(df, aes(x=Values, y=Variant))+
    geom_joy(scale = 2, alpha=0.5) +
    scale_y_discrete(expand=c(0.01, 0)) +
    scale_x_continuous(expand=c(0.01, 0)) +
    theme_joy()

