R - Multiple data points in forest plot using ggplot2

Example data:

df <- data.frame(Mean1=c(12,15,17,14,16,18,16,14),Lower1=c(8,11,13,7,15,12,12,11),Upper1=c(16,18,21,21,17,24,20,17),Mean2=c(13,16,18,15,17,19,17,15),Lower2=c(9,12,14,8,16,13,13,12),Upper2=c(17,19,22,22,18,25,21,18))
rownames(df) <- c(1,2,3,4,5,6,7,8)

I can produce a forest plot with Mean1, Lower1 and Upper1 from df:

ggplot(df, aes(y = row.names(df), x = df$Mean1)) +
     geom_point(size = 4) +
     geom_errorbarh(aes(xmax = df$Upper1, xmin = df$Lower1))

So my question is: how can I add Mean2, Lower2 and Upper2 from df to the plot, so that the two means for each observation (row) are shown as a pair, each with its own error bar? The output would be a similar forest plot, but with both means and their error limits displayed in pairs for each observation. I hope this makes sense.

I haven't tried anything because I simply don't know where to start.

Is this possible without disrupting the structure of the data frame?

The most natural way to do it is to use the position argument, but dodging needs the values stacked in one vector together with a grouping variable, not spread across separately named columns. You can build those vectors in place:

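# No data argument here: the vectors below have 2 * nrow(df) elements,
# which current ggplot2 rejects when the plot data has only nrow(df) rows.
ggplot(mapping = aes(x = rep(rownames(df), 2),
                     y = c(df$Mean1, df$Mean2),
                     group = rep(c(1, 2), each = nrow(df)))) +
  geom_point(position = position_dodge(1)) +
  coord_flip()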

But the more proper way is to change the structure of the data frame, which makes the plotting code cleaner:

ggplot(df, aes(x = rownames, 
           y = Mean, 
           group=groups)) +
geom_point(size = 4, position=position_dodge(1))+
geom_errorbar(aes(ymax = Upper, ymin = Lower), position=position_dodge(1))+
coord_flip()

For this example I transformed the data frame as follows (run this before the plot above):

df <- data.frame(Mean=c(df$Mean1,df$Mean2),
                 Lower=c(df$Lower1,df$Lower2),
                 Upper=c(df$Upper1,df$Upper2),
                 groups=factor(rep(c(1,2), each=nrow(df))),
                 rownames=as.character(rep(rownames(df), 2)))

I am not sure what you mean, but do you want to plot the Mean2 values on top of the forest plot? In that case you can assign the first plot to a variable, say s1, and then add the new data to it like this (maybe with different colors):

s1<-ggplot(df, aes(y = row.names(df), x = df$Mean1)) +
     geom_point(size = 4) +
     geom_errorbarh(aes(xmax = df$Upper1, xmin = df$Lower1))

s1 + geom_point(data=df, aes(y = row.names(df), x = df$Mean2)) + 
  geom_errorbarh(aes(xmax = df$Upper2, xmin = df$Lower2))

Otherwise you can restructure the data and then add facet_grid(. ~ Sample) to make separate graphs for your samples (Mean1 and Mean2).
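The faceting suggestion could be sketched roughly as follows. This is only an illustration: the long data frame and its column names (Row, Sample, Mean, Lower, Upper) are assumptions made here, not something given in the answer, and the reshaping is done by hand in base R.

```r
library(ggplot2)

# Example data from the question
df <- data.frame(Mean1  = c(12, 15, 17, 14, 16, 18, 16, 14),
                 Lower1 = c(8, 11, 13, 7, 15, 12, 12, 11),
                 Upper1 = c(16, 18, 21, 21, 17, 24, 20, 17),
                 Mean2  = c(13, 16, 18, 15, 17, 19, 17, 15),
                 Lower2 = c(9, 12, 14, 8, 16, 13, 13, 12),
                 Upper2 = c(17, 19, 22, 22, 18, 25, 21, 18))

# Stack the two samples on top of each other (hypothetical column names)
long <- data.frame(
  Row    = factor(rep(rownames(df), 2), levels = rownames(df)),
  Sample = rep(c("Mean1", "Mean2"), each = nrow(df)),
  Mean   = c(df$Mean1, df$Mean2),
  Lower  = c(df$Lower1, df$Lower2),
  Upper  = c(df$Upper1, df$Upper2)
)

# One panel per sample, side by side
p <- ggplot(long, aes(x = Mean, y = Row)) +
  geom_point(size = 4) +
  geom_errorbarh(aes(xmin = Lower, xmax = Upper)) +
  facet_grid(. ~ Sample)
print(p)
```

Note that this puts the two samples in separate panels rather than pairing them within one panel, so it is an alternative to the paired layout asked for in the question.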

I don't know how to do it without disrupting the structure of your data frame, but since your data frame is not tidy data I would recommend changing it anyway. The following might answer your question:

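library(tidyr)
df$itemid <- rownames(df)                   # keep the row id as a column
df <- gather(df, type, value, -itemid)      # wide -> long: one row per id and column
# Split "Mean1" into "Mean" and "1". sep = -1 splits before the last character
# in tidyr >= 0.8.2; older versions needed sep = -2 for the same result.
df <- separate(df, type, into = c("type", "grpid"), sep = -1)
df <- spread(df, type, value)               # one column each for Lower, Mean, Upper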

This is done in separate steps so that it is easier to execute one step at a time and see what is happening. Then you can plot using:

library(ggplot2)
ggplot(df, aes(y = paste(itemid, grpid), x = Mean, color = grpid)) +
     geom_point(size = 4) +
     geom_errorbarh(aes(xmax = Upper, xmin = Lower))
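With tidyr 1.0.0 or later, the same reshaping can probably be done in one step with pivot_longer(). The sketch below is one possible variant, not the answer's own code: the names_pattern regular expression, the tidy variable name and the dodge width are assumptions chosen for this example. It dodges the two groups around a shared row position, so each observation shows its two means as a pair.

```r
library(tidyr)
library(ggplot2)

# Example data from the question
df <- data.frame(Mean1  = c(12, 15, 17, 14, 16, 18, 16, 14),
                 Lower1 = c(8, 11, 13, 7, 15, 12, 12, 11),
                 Upper1 = c(16, 18, 21, 21, 17, 24, 20, 17),
                 Mean2  = c(13, 16, 18, 15, 17, 19, 17, 15),
                 Lower2 = c(9, 12, 14, 8, 16, 13, 13, 12),
                 Upper2 = c(17, 19, 22, 22, 18, 25, 21, 18))
df$itemid <- rownames(df)

# "Mean1" -> value column "Mean", group id "1" (and likewise for Lower/Upper)
tidy <- pivot_longer(df,
                     cols = -itemid,
                     names_to = c(".value", "grpid"),
                     names_pattern = "([A-Za-z]+)([0-9]+)")
tidy$itemid <- factor(tidy$itemid, levels = rownames(df))  # keep rows in order

# Dodge the two groups side by side, then flip so the bars run horizontally
p <- ggplot(tidy, aes(x = itemid, y = Mean, ymin = Lower, ymax = Upper,
                      color = grpid)) +
  geom_pointrange(position = position_dodge(width = 0.5)) +
  coord_flip()
print(p)
```

Using coord_flip() with a vertical geom keeps position_dodge() working as usual, since dodging acts along the x axis before the flip.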
