ggplot2: multiple variables on x-axis at multiple times

I have a data frame with observation numbers (3 observations per id), height, weight and fev that looks like this (just an example):

id      obs     height  weight       fev
1         1        160     80         90
1         2        150     70         85
1         3        155     76         87
2         1        140     67         91
2         2        189     78         71
2         3        178     86         89

I need to plot this data using ggplot2 so that the x-axis shows the 3 variables height, weight and fev, and the observation numbers are displayed as 3 color-coded vertical lines for each variable. Each line should show the median as a solid circle and the 25th and 75th percentiles as caps at the upper and lower ends of the line (no minimum or maximum needed). So far I have tried many variations of box plots, but I am not even getting close. Any suggestions on how to approach or solve this?

Thanks

Try putting the data into long format before graphing. I generated some more data: 12 subjects, each with 3 observations.

id <- rep(1:12, each = 3)
obs <- rep(1:3, 12)
height <- seq(140,189, length.out =  36)
weight <- seq(67,86, length.out = 36)
fev <- seq(71,91, length.out = 36)

df <- as.data.frame(cbind(id,obs,height, weight, fev))


library(ggplot2)
library(reshape2) # used to melt data from wide to long format

longdf <- melt(df,id.vars = c('id', 'obs')) 

There is no need to define measure variables here: since the id.vars are defined, the remaining columns default to measure variables. If you have more variables in your data set, define the measure variables in that same line with measure.vars = c("height", "weight", "fev"):

longdf <- melt(df,id.vars = c('id', 'obs'), measure.vars = c("height", "weight", "fev"))
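
To see what the melt actually produces, here is a minimal sketch run on the six example rows from the question. The names small and small_long are made up for this illustration; everything else is the same melt() call as above.

```r
library(reshape2)

# The six example rows from the question (wide format)
small <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  obs    = c(1, 2, 3, 1, 2, 3),
  height = c(160, 150, 155, 140, 189, 178),
  weight = c(80, 70, 76, 67, 78, 86),
  fev    = c(90, 85, 87, 91, 71, 89)
)

# Long format: one row per id x obs x measured variable
small_long <- melt(small, id.vars = c("id", "obs"),
                   measure.vars = c("height", "weight", "fev"))

dim(small_long)              # 18 rows, 4 columns: id, obs, variable, value
levels(small_long$variable)  # "height" "weight" "fev"
head(small_long, 3)
```

Each wide row becomes three long rows, so the variable column can be mapped to the x-axis and obs to a color.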

Apologies, I haven't earned enough votes to put figures into my responses.

ggplot(data = longdf, aes(x = variable, y = value, fill = factor(obs))) + 
  geom_boxplot(notch = TRUE, notchwidth = .25, width = .25, position = position_dodge(.5))

This does not produce the exact graph you described, which sounded like geom_linerange or something similar. Those geoms require x, ymin and ymax aesthetics to draw. Otherwise a regular old boxplot has your 1st and 3rd quartiles and median marked. I adjusted the boxplot parameters to make it thinner (the notch and width arguments) and separated the boxes slightly with position_dodge(.5).
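
If the line-with-caps look is what you are after, one possible approach is to compute the three statistics yourself and hand them to geom_errorbar and geom_point. This is only a sketch: it rebuilds the synthetic data in long format with base R, and the names summ, med, q25, q75, by_group and dodge are made up for the illustration.

```r
library(ggplot2)

# Synthetic data as in the answer, built directly in long format with base R
n <- 36
longdf <- data.frame(
  obs      = rep(rep(1:3, 12), times = 3),
  variable = factor(rep(c("height", "weight", "fev"), each = n),
                    levels = c("height", "weight", "fev")),
  value    = c(seq(140, 189, length.out = n),
               seq(67, 86, length.out = n),
               seq(71, 91, length.out = n))
)

# One row per variable x obs: median plus 25th and 75th percentiles
by_group <- value ~ variable + obs
summ <- aggregate(by_group, data = longdf, FUN = median)
names(summ)[names(summ) == "value"] <- "med"
summ$q25 <- aggregate(by_group, data = longdf,
                      FUN = function(v) unname(quantile(v, 0.25)))$value
summ$q75 <- aggregate(by_group, data = longdf,
                      FUN = function(v) unname(quantile(v, 0.75)))$value

# Capped vertical lines for the quartiles, solid circles for the medians,
# dodged so the three observations sit side by side within each variable
dodge <- position_dodge(width = 0.5)
p <- ggplot(summ, aes(x = variable, color = factor(obs))) +
  geom_errorbar(aes(ymin = q25, ymax = q75), width = 0.2, position = dodge) +
  geom_point(aes(y = med), size = 3, position = dodge) +
  labs(y = "value", color = "obs")
p
```

Precomputing the summary keeps the percentiles explicit, at the cost of an extra data frame.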

After reading your response, I edited my original answer.

You could try facet_wrap, and watch the difference between "fill" and "color" in ggplot. If a geom has no interior to fill the way a boxplot or density does (for example a point or a line), it has to be colored instead. So use color instead of fill in the original aes():

ggplot(data = longdf, aes(x = variable, y = value, color = factor(obs))) + 
stat_summary(fun.data=median_hilow) + facet_wrap(.~obs)

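This gives you one panel per observation: observation 1 with height, weight and fev side by side, then observation 2 with height, weight and fev, and so on. Note that median_hilow needs the Hmisc package installed, and by default it spans the 2.5th to 97.5th percentiles; adding fun.args = list(conf.int = 0.5) to stat_summary gives the 25th and 75th instead.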
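
If you would rather not depend on Hmisc, stat_summary accepts any function that returns y, ymin and ymax. Below is a sketch with a hypothetical helper, median_iqr, which is not part of ggplot2; the data is the same synthetic set, rebuilt in long format so the block runs on its own.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): median with 25th/75th percentiles,
# returned in the y / ymin / ymax layout that stat_summary(fun.data = ) expects
median_iqr <- function(v) {
  q <- unname(quantile(v, c(0.25, 0.5, 0.75)))
  data.frame(y = q[2], ymin = q[1], ymax = q[3])
}

# Synthetic data as in the answer, built directly in long format
n <- 36
longdf <- data.frame(
  obs      = rep(rep(1:3, 12), times = 3),
  variable = factor(rep(c("height", "weight", "fev"), each = n),
                    levels = c("height", "weight", "fev")),
  value    = c(seq(140, 189, length.out = n),
               seq(67, 86, length.out = n),
               seq(71, 91, length.out = n))
)

# One panel per observation; each variable drawn as median point + quartile line
p <- ggplot(longdf, aes(x = variable, y = value, color = factor(obs))) +
  stat_summary(fun.data = median_iqr) +
  facet_wrap(~ obs) +
  labs(color = "obs")
p

median_iqr(1:9)  # y = 5, ymin = 3, ymax = 7
```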

If that still isn't what you want, and you'd rather have height for observations 1, 2, 3, then weight for observations 1, 2, 3, and so on, you'll need to modify your melting to have two variable columns and two value columns. Essentially, make two melted data frames, then cbind them. And because each observation has three variables, you'll need to rbind the observation frame three times so that both data frames have the same number of rows:

 obsonly <- melt(df, id.vars = c('id'), measure.vars = 'obs')

 obsonly <- rbind(obsonly,obsonly,obsonly) #making rows equal 

 longvars <- melt(df[-2],id.vars = 'id') #dropping obs from melt

 longdf2 <- cbind(obsonly,longvars)

 longdf2 <- longdf2[-4] #dropping second id column


 colnames(longdf2)[c(2:5)] <- c('obs', 'obsnum', 'variable', 'value')

 ggplot(data = longdf2, aes(x = obsnum, y = value, 
         color = factor(variable))) + 
         stat_summary(fun.data=median_hilow) +
         facet_wrap(.~variable)

From here you can play around with the x-axis marks (it probably isn't useful to have a tick at observation 1.5) and the spacing of the lines from each other.
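
As one way to tidy the x-axis, the breaks can be pinned to the three observation numbers. This is a sketch under assumptions: toy is a hypothetical stand-in for longdf2 with only the columns the plot needs, the fun / fun.min / fun.max arguments assume ggplot2 3.3.0 or later, and scales = "free_y" is my own addition so each variable gets its own y-range.

```r
library(ggplot2)

# Hypothetical stand-in for longdf2: numeric obsnum plus variable and value
n <- 36
toy <- data.frame(
  obsnum   = rep(rep(1:3, 12), times = 3),
  variable = factor(rep(c("height", "weight", "fev"), each = n),
                    levels = c("height", "weight", "fev")),
  value    = c(seq(140, 189, length.out = n),
               seq(67, 86, length.out = n),
               seq(71, 91, length.out = n))
)

p <- ggplot(toy, aes(x = obsnum, y = value, color = variable)) +
  stat_summary(fun     = median,
               fun.min = function(v) quantile(v, 0.25),
               fun.max = function(v) quantile(v, 0.75)) +
  scale_x_continuous(breaks = 1:3) +        # ticks only at 1, 2 and 3
  facet_wrap(~ variable, scales = "free_y") # assumption: separate y-range per panel
p
```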

OK, what I did below instead was make three graphs and then piece them together with gridExtra. Read more about the package here: http://www.sthda.com/english/wiki/wiki.php?id_contents=7930

I took the common-legend code from that site to produce the following, starting with our existing longdf2. Because the graphs are pieced together, the information about the corresponding observation is in the title of each graph.

library(ggplot2)
library(reshape2)
library(dplyr) # provides %>% and filter()

id <- rep(1:12, each = 3)
obs <- rep(1:3, 12)
height <- seq(140,189, length.out =  36)
weight <- seq(67,86, length.out = 36)
fev <- seq(71,91, length.out = 36)

df <- as.data.frame(cbind(id,obs,height, weight, fev))

obsonly <- melt(df, id.vars = c('id'), measure.vars = 'obs')

obsonly <- rbind(obsonly,obsonly,obsonly)

newvars <- melt(df[-2],id.vars = 'id')

longdf2 <- cbind(obsonly,newvars)

longdf2 <- longdf2[-4] #dropping second id column

colnames(longdf2)[c(2:5)] <- c('obs', 'obsnum', 'variable', 'value')

#Make graph 1 of observation 1

g1 <- longdf2 %>%
  dplyr::filter(obsnum == 1) %>%
  ggplot(aes(x = variable, y = value, color = variable)) +
  stat_summary(fun.data = median_hilow) +
  labs(title = "Observation 1") +
  theme(plot.title = element_text(hjust = 0.5)) #has a legend

#Make graph 2 of observation 2

g2 <- longdf2 %>%
  dplyr::filter(obsnum == 2) %>%
  ggplot(aes(x = variable, y = value, color = variable)) +
  stat_summary(fun.data = median_hilow) +
  labs(title = "Observation 2") +
  theme(plot.title = element_text(hjust = 0.5), legend.position = 'none')
  #specified as none to make common legend at end

#Make graph 3 of observation 3

g3 <- longdf2 %>%
  dplyr::filter(obsnum == 3) %>%
  ggplot(aes(x = variable, y = value, color = variable)) +
  stat_summary(fun.data = median_hilow) +
  labs(title = "Observation 3") +
  theme(plot.title = element_text(hjust = 0.5), legend.position = 'none')


library(gridExtra)
# Extract the legend grob from a built ggplot
get_legend <- function(myggplot) {
  tmp <- ggplot_gtable(ggplot_build(myggplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}


# Save legend

legend <- get_legend(g1)


# Remove legend from 1st graph

g1 <- g1 + theme(legend.position = 'none')

# Combine graphs

grid.arrange(g1, g2, g3, legend, ncol=4, widths=c(2.3, 2.3, 2.3, 0.8))

There are plenty of other little tweaks you could make along the way.
