How to plot density curves for each column in R?
I have a data frame w like this:
>head(w,3)
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
1 0.2446884 0.3173719 0.74258410 0.0000000 0 0.0000000 0.01962759 0.0000000 0.0000000 0.5995647 0 0.30201691 0.03109935 0.16897571
2 0.0000000 0.0000000 0.08592243 0.2254971 0 0.7381867 0.11936323 0.2076167 0.0000000 1.0587742 0 0.50226734 0.51295661 0.01298853
3 8.4293893 4.9985040 2.22526463 0.0000000 0 3.6600283 0.00000000 0.0000000 0.2573714 0.8069288 0 0.05074886 0.00000000 0.59403855
V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27
1 0.00000000 0.0000000 0.000000 0.1250837 0.000000 0.5468143 0.3503245 0.000000 0.183144204 0.23026538 6.9868429 1.5774150 0.0000000
2 0.01732732 0.8064441 0.000000 0.0000000 0.000000 0.0000000 0.0000000 0.000000 0.015123385 0.07580794 0.6160713 0.7452335 0.0740328
3 2.66846151 0.0000000 1.453987 0.0000000 1.875298 0.0000000 0.0000000 0.893363 0.004249061 0.00000000 1.6185897 0.0000000 0.7792773
V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 refseq
1 0.5543028 0 0.00000 0.0000000 0.08293075 0.18261450 0.3211127 0.2765295 0 0.04230929 0.05017316 0.3340662 0.00000000 NM_000014
2 0.0000000 0 0.00000 0.0000000 0.00000000 0.03531411 0.0000000 0.4143325 0 0.14894716 0.58056304 0.3310173 0.09162460 NM_000015
3 0.8047882 0 0.88308 0.7207709 0.01574767 0.00000000 0.0000000 0.1183736 0 0.00000000 0.00000000 1.3529881 0.03720155 NM_000016
dim(w)
[1] 37126 41
I tried to plot the density curve of each column (except the last column) on one page. It seems that ggplot2 can do this.
I tried this, following this post:
ggplot(data=w[,-41], aes_string(x=colnames)) + geom_density()
But it doesn't work; it fails with this error:
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
And I'm not sure how to convert this data frame to a format that ggplot2 accepts. Or is there another way to do this in R?
ggplot needs your data in a long format, like so:
variable value
1 V1 0.24468840
2 V1 0.00000000
3 V1 8.42938930
4 V2 0.31737190
Once it's melted into a long data frame, you can group all the density plots by variable. In the snippet below, ggplot uses the w.plot data frame for plotting (which doesn't need to omit the final refseq variable, because melt treats non-numeric columns as id variables by default). You can modify it to use facets, different colors, fills, etc.
w <- as.data.frame(cbind(
c(0.2446884, 0.0000000, 8.4293893),
c(0.3173719, 0.0000000, 4.9985040),
c(0.74258410, 0.08592243, 2.22526463)))
w$refseq <- c("NM_000014", "NM_000015", "NM_000016")
library(ggplot2)
library(reshape2)
w.plot <- melt(w)
p <- ggplot(aes(x=value, colour=variable), data=w.plot)
p + geom_density()
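As a hedged sketch of the "facets, fills" variation mentioned above: the same three-column toy data, melted with an explicit id.vars = "refseq" so the id column is stated rather than guessed, and drawn with one panel per column. The alpha value and the free scales are illustrative choices, not part of the original snippet.

```r
library(ggplot2)
library(reshape2)

# Same toy data as above (three of the 40 columns, three rows)
w <- data.frame(
  V1 = c(0.2446884, 0.0000000, 8.4293893),
  V2 = c(0.3173719, 0.0000000, 4.9985040),
  V3 = c(0.74258410, 0.08592243, 2.22526463),
  refseq = c("NM_000014", "NM_000015", "NM_000016")
)

# Name the id column explicitly instead of letting melt() guess it
w.plot <- melt(w, id.vars = "refseq")

# One panel per original column; fill and alpha are illustrative choices
p <- ggplot(w.plot, aes(x = value, fill = variable)) +
  geom_density(alpha = 0.4) +
  facet_wrap(~variable, scales = "free")
p
```

With the full 40-column data, adding ncol = 5 to facet_wrap would give an 8 x 5 grid.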
Use "melt" from the "reshape" package (you could also use the base reshape function, but it's a more complicated call).
require (reshape)
require (ggplot2)
long = melt(w, id.vars= "refseq")
ggplot(long, aes (value)) +
geom_density(aes(color = variable)) # color must be mapped inside aes()
# or maybe you wanted separate plots on the same page?
ggplot(long, aes (value)) +
geom_density() +
facet_wrap(~variable)
There are lots of other ways to plot this in ggplot: see http://docs.ggplot2.org/0.9.3.1/geom_histogram.html for examples.
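For reference, a minimal base-R sketch of the same wide-to-long step, with no extra packages. It uses stack() rather than reshape(), which is swapped in here for brevity, and the toy data frame is just the first three columns and rows from the question.

```r
# Toy data: three of the 40 numeric columns plus the id column
w <- data.frame(
  V1 = c(0.2446884, 0.0000000, 8.4293893),
  V2 = c(0.3173719, 0.0000000, 4.9985040),
  V3 = c(0.74258410, 0.08592243, 2.22526463),
  refseq = c("NM_000014", "NM_000015", "NM_000016")
)

# Drop the non-numeric refseq column, then stack the rest.
# stack() returns a data frame with columns `values` and `ind`.
long <- stack(w[, names(w) != "refseq"])
names(long) <- c("value", "variable")

# Carry the id along by repeating it once per stacked column
long$refseq <- rep(w$refseq, times = ncol(w) - 1)

head(long, 4)
```

The resulting long data frame has the same value/variable layout that the ggplot calls above expect.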
Here's a solution using the plot function and a little loop.
Call your plot:
plot(density(df[,1]), type = "n")
then run this to add the lines:
n = dim(df)[2]-1
for(i in 1:n){
lines(density(c(df[,i])))
}
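One caveat with the loop above: the axis limits come from the first column only, so taller or wider curves in other columns can be clipped. A hedged sketch that estimates every density first and then sets shared limits; the simulated df (built with rexp) and the colour and legend choices are assumptions for illustration only.

```r
set.seed(42)
# Simulated stand-in for the real data: numeric columns, id column last
df <- data.frame(
  V1 = rexp(200, rate = 1),
  V2 = rexp(200, rate = 2),
  V3 = rexp(200, rate = 0.5),
  refseq = paste0("NM_", seq_len(200))
)

# Estimate a density for every column except the last
dens <- lapply(df[, -ncol(df)], density)

# Shared limits so no curve falls outside the plotting region
xlim <- range(unlist(lapply(dens, function(d) d$x)))
ylim <- range(unlist(lapply(dens, function(d) d$y)))

# Empty canvas, then one coloured line per column
plot(dens[[1]], type = "n", xlim = xlim, ylim = ylim,
     main = "Density per column")
for (i in seq_along(dens)) {
  lines(dens[[i]], col = i)
}
legend("topright", legend = names(dens), col = seq_along(dens), lty = 1)
```

With 40 columns the legend gets crowded, so a faceted layout may read better than overlaid lines.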
This will make an 8 x 5 grid of density plots, with multiple lines on each plot coloured by the variable refseq:
library(tidyverse)
w_density <- w[,1:40] # columns you want densities for
w_density$refseq <- w$refseq # maybe you have a variable to group by
w_density %>%
pivot_longer(!refseq, names_to = "variable", values_to = "value") %>%
ggplot(aes(x = value, colour = refseq)) +
geom_density(show.legend = TRUE) +
facet_wrap(~variable, scales = "free", ncol = 5) +
ggtitle("Title goes here")
If the grid is not the right size and you're using Rmd, then you can play with the chunk's figure size options:
```{r, fig.height=20, fig.width=11}