
Dodge columns in ggplot2

I am trying to create a figure that summarises my data. The data are about the prevalence of drug use, obtained from different practices in different countries. Each practice has contributed a different amount of data, and I want to show all of this in my figure.

Here is a subset of the data to work on:

gr<-data.frame(matrix(0,36))
gr$drug<-c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b")
gr$practice<-c("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r")
gr$country<-c("c1","c1","c1","c1","c1","c1","c1","c1","c1","c1","c2","c2","c2","c2","c2","c2","c3","c3","c1","c1","c1","c1","c1","c1","c1","c1","c1","c1","c2","c2","c2","c2","c2","c2","c3","c3")
gr$prevalence<-c(9.14,5.53,16.74,1.93,8.51,14.96,18.90,11.18,15.00,20.10,24.56,22.29,19.41,20.25,25.01,25.87,29.33,20.76,18.94,24.60,26.51,13.37,23.84,21.82,23.69,20.56,30.53,16.66,28.71,23.83,21.16,24.66,26.42,27.38,32.46,25.34)
gr$prop<-c(0.027,0.023,0.002,0.500,0.011,0.185,0.097,0.067,0.066,0.023,0.433,0.117,0.053,0.199,0.098,0.100,0.594,0.406,0.027,0.023,0.002,0.500,0.011,0.185,0.097,0.067,0.066,0.023,0.433,0.117,0.053,0.199,0.098,0.100,0.594,0.406)
gr$low.CI<-c(8.27,4.80,12.35,1.83,7.22,14.53,18.25,10.56,14.28,18.76,24.25,21.72,18.62,19.83,24.36,25.22,28.80,20.20,17.73,23.15,21.06,13.12,21.79,21.32,22.99,19.76,29.60,15.41,28.39,23.25,20.34,24.20,25.76,26.72,31.92,24.73)
gr$high.CI<-c(10.10,6.37,22.31,2.04,10.00,15.40,19.56,11.83,15.74,21.52,24.87,22.86,20.23,20.68,25.67,26.53,29.86,21.34,20.21,26.10,32.79,13.63,26.02,22.33,24.41,21.39,31.48,17.98,29.04,24.43,22.01,25.12,27.09,28.05,33.01,25.95)

The code I wrote is this:

p<-ggplot(data=gr, aes(x=factor(drug), y=as.numeric(gr$prevalence), ymax=max(high.CI),position="dodge",fill=practice,width=prop))
colour<-c(rep("gray79",10),rep("gray60",6),rep("gray39",2))
p + theme_bw()+
  geom_bar(stat="identity",position = position_dodge(0.9)) +
  labs(x="Drug",y="Prevalence") + 
  geom_errorbar(ymax=gr$high.CI,ymin=gr$low.CI,position=position_dodge(0.9),width=0.25,size=0.25,colour="black",aes(x=factor(drug), y=as.numeric(gr$prevalence), fill=practice)) +
  ggtitle("Drug usage by country and practice") +
  scale_fill_manual(values = colour)+ guides(fill=F)

The figure I obtain is this one, where the bars are all on top of each other, whereas I want them dodged.


I also obtain the following warning:

ymax not defined: adjusting position using y instead
Warning message:
position_dodge requires non-overlapping x intervals

Ideally I would get the bars next to one another, each with its error bar in the middle of the bar, all organised by country.

Also, should I be concerned about the warning (which I clearly do not fully understand)?

I hope this makes sense. I hope I am close enough, but I don't seem to be getting anywhere, so some help would be greatly appreciated.

Thank you

ggplot's geom_bar() accepts the width parameter, but by default it doesn't line the bars up neatly against one another in dodged position. The following workaround references the solution here:

library(ggplot2)
library(dplyr)

# calculate x-axis position for bars of varying width
gr <- gr %>%
  group_by(drug) %>%
  arrange(practice) %>%
  mutate(pos = 0.5 * (cumsum(prop) + cumsum(c(0, prop[-length(prop)])))) %>%
  ungroup()

x.labels <- gr$practice[gr$drug == "a"]
x.pos <- gr$pos[gr$drug == "a"]

ggplot(gr,
       aes(x = pos, y = prevalence, 
           fill = country, width = prop,
           ymin = low.CI, ymax = high.CI)) +
  geom_col(col = "black") +
  geom_errorbar(size = 0.25, colour = "black") +
  facet_wrap(~drug) +
  scale_fill_manual(values = c("c1" = "gray79",
                               "c2" = "gray60",
                               "c3" = "gray39"),
                    guide = "none") +
  scale_x_continuous(name = "Drug",
                     labels = x.labels,
                     breaks = x.pos) +
  labs(title = "Drug usage by country and practice", y = "Prevalence") +
  theme_classic()
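
To see what the `pos` calculation in the `mutate()` call is doing, here is a minimal base-R sketch with made-up proportions (the `prop` values below are hypothetical, not taken from the question's data). Each bar's centre is the midpoint between its left edge (the sum of all preceding widths) and its right edge (the running total including the bar itself), which is why bars of different widths end up sitting edge to edge.

```r
# Hypothetical bar widths for a single facet (they sum to 1)
prop <- c(0.2, 0.5, 0.3)

right <- cumsum(prop)                        # right edge of each bar
left  <- cumsum(c(0, prop[-length(prop)]))   # left edge of each bar
pos   <- 0.5 * (left + right)                # bar centres, same formula as in mutate()

print(data.frame(prop, left, right, pos))
```

With these widths the centres come out at 0.1, 0.45 and 0.85, and each bar's right edge coincides with the next bar's left edge.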


You are trying to convey a lot of information here. To contrast drug A and drug B across countries using bar plots while accounting for the proportions, you might use the facet_grid function. Try this:

colour <- c(rep("gray79", 10), rep("gray60", 6), rep("gray39", 2))

gr$drug <- paste("Drug", gr$drug)
p <- ggplot(data = gr, aes(x = factor(practice), y = as.numeric(prevalence),
                           ymax = high.CI, ymin = low.CI,
                           fill = practice, width = prop))

p + theme_bw() + facet_grid(drug ~ country, scales = "free") +
  geom_bar(stat = "identity") +
  labs(x = "Practice", y = "Prevalence") +
  geom_errorbar(position = position_dodge(0.9), width = 0.25, size = 0.25, colour = "black") +
  ggtitle("Drug usage by country and practice") +
  scale_fill_manual(values = colour) + guides(fill = "none")


The widths are too small in country c1 and, as you indicated, one clinic is quite influential.

Also, you can specify your aesthetics once in ggplot(aes(...)) without having to reset them in each layer, and you do not need to include the data frame's name inside aes() in the ggplot call.
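
A minimal sketch of that last point, assuming ggplot2 is installed and using a small made-up data frame (`toy` and its values are hypothetical, chosen only for illustration): the aesthetics are declared once in `ggplot()`, columns are referred to by bare name rather than as `toy$prevalence`, and both layers pick the mapping up by inheritance.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  practice   = c("a", "b", "c"),
  prevalence = c(9.1, 5.5, 16.7),
  low.CI     = c(8.3, 4.8, 12.4),
  high.CI    = c(10.1, 6.4, 22.3)
)

# Aesthetics are set once, with bare column names (no toy$...)
p <- ggplot(toy, aes(x = practice, y = prevalence,
                     ymin = low.CI, ymax = high.CI)) +
  geom_col(fill = "gray79") +      # inherits x and y
  geom_errorbar(width = 0.25)      # inherits x, ymin and ymax
```

Calling `print(p)` draws the bars with their error bars; neither layer needed its own `aes()`.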
