
Combined barplots with R ggplot2: dodged and stacked

I have a table of data that already contains several values to be plotted on a barplot with the ggplot2 package (the data are already cumulative).

The data in the data frame "reserves" have the form (simplified):

period,amount,a1,a2,b1,b2,h1,h2,h3,h4
J,18.1,30,60,40,60,15,50,30,5
K,29,65,35,75,25,5,50,40,5
P,13.3,94,6,85,15,10,55,20,15
N,21.6,95,5,80,20,10,55,20,15

The first column (period) is the geological epoch. It goes on the x axis, and I did not want any extra ordering applied to it, so I set up the factor levels with the command

reserves$period <- factor(reserves$period, levels = reserves$period)
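As a small check of what that command does (a sketch using only the first two columns of the sample data): by default factor() sorts its levels alphabetically, which would put N before P, while passing the column itself as levels keeps the order of the rows. The names default_levels and kept_levels are only illustrative.

```r
# Rebuild a reduced version of the sample data (two columns only)
reserves <- data.frame(period = c("J", "K", "P", "N"),
                       amount = c(18.1, 29, 13.3, 21.6),
                       stringsAsFactors = FALSE)

# Default behaviour: levels are sorted alphabetically
default_levels <- levels(factor(reserves$period))

# Explicit levels: the order of appearance in the data is kept
reserves$period <- factor(reserves$period, levels = reserves$period)
kept_levels <- levels(reserves$period)

print(default_levels)
print(kept_levels)
```

This relies on each period appearing exactly once; if periods were repeated, unique(reserves$period) would be needed for the levels argument.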

The column "amount" is the main column to be plotted on the y axis (it is the percentage of hydrocarbons in each epoch, but it could equally be in absolute values, say, millions of tons). So the basic plot is produced by the command:

ggplot(reserves,aes(x=period,y=amount)) + geom_bar(stat="identity")

But here is the question. I need to plot the other values, that is a1-a2, b1-b2, and h1-h4, on the same bar graph. These are percentage values within each letter (for example, if a1=60, then a2=40; the same holds for b1-b2, and h1-h4 likewise sum to 100). So I need the values a1-a2 shown in some color, proportionally dividing the "amount" bar for each value of x (a stacked barplot), and then the same for the values b1-b2. That gives two adjacent columns for each period (grouped barplots), each of them stacked. Finally, I need a third column for the values h1-h4, perhaps also as a stacked barplot, either as a third column or as a staggered barplot above the first one.

So the layout looks like this:

[Image: layout of the combined barplot]

I learned that I first need to reshape the data with the reshape2 package and then use position="dodge" or position="fill" in geom_bar(), but what I need here is a combination of the two. And the third barplot (for the values h1-h4) seems to need a "stacked percent" representation with fixed height.

Are there packages that handle the data for plotting in a more intuitive way? Let's say we just declare that we want the variables ai, bi, hi to be plotted.

First you should reshape your data from wide to long, then scale your proportions to their raw values. Then split your old column names (now the values in "lett") into their letters and numbers for labeling. If your real data aren't formatted like this (a1...h4), there are ways to handle that as well.

library(dplyr)
library(tidyr)
library(ggplot2)

reserves <- read.csv(text = "period,amount,a1,a2,b1,b2,h1,h2,h3,h4
J,18.1,30,60,40,60,15,50,30,5
K,29,65,35,75,25,5,50,40,5
P,13.3,94,6,85,15,10,55,20,15
N,21.6,95,5,80,20,10,55,20,15") 

reserves.tidied <- reserves %>% 
  gather(key = lett, value = prop, -period, -amount) %>% 
  mutate(rawvalue = prop * amount/100,
         lett1 = substr(lett, 1, 1),
         num = substr(lett, 2, 2)) 

reserves.tidied
   period amount lett prop rawvalue lett1 num
1       J   18.1   a1   30    5.430     a   1
2       K   29.0   a1   65   18.850     a   1
3       P   13.3   a1   94   12.502     a   1
4       N   21.6   a1   95   20.520     a   1
5       J   18.1   a2   60   10.860     a   2
6       K   29.0   a2   35   10.150     a   2
7       P   13.3   a2    6    0.798     a   2
8       N   21.6   a2    5    1.080     a   2
9       J   18.1   b1   40    7.240     b   1
10      K   29.0   b1   75   21.750     b   1
11      P   13.3   b1   85   11.305     b   1
12      N   21.6   b1   80   17.280     b   1
13      J   18.1   b2   60   10.860     b   2
14      K   29.0   b2   25    7.250     b   2
15      P   13.3   b2   15    1.995     b   2
16      N   21.6   b2   20    4.320     b   2
17      J   18.1   h1   15    2.715     h   1
18      K   29.0   h1    5    1.450     h   1
19      P   13.3   h1   10    1.330     h   1
20      N   21.6   h1   10    2.160     h   1
21      J   18.1   h2   50    9.050     h   2
22      K   29.0   h2   50   14.500     h   2
23      P   13.3   h2   55    7.315     h   2
24      N   21.6   h2   55   11.880     h   2
25      J   18.1   h3   30    5.430     h   3
26      K   29.0   h3   40   11.600     h   3
27      P   13.3   h3   20    2.660     h   3
28      N   21.6   h3   20    4.320     h   3
29      J   18.1   h4    5    0.905     h   4
30      K   29.0   h4    5    1.450     h   4
31      P   13.3   h4   15    1.995     h   4
32      N   21.6   h4   15    3.240     h   4
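In current versions of tidyr, gather() is superseded by pivot_longer(). A minimal sketch of the same reshaping step with pivot_longer() might look like the following; the name reserves.long is only illustrative, and the rows come out grouped by period rather than by letter, so the row order differs from the output above even though the values are the same.

```r
library(dplyr)
library(tidyr)

reserves <- read.csv(text = "period,amount,a1,a2,b1,b2,h1,h2,h3,h4
J,18.1,30,60,40,60,15,50,30,5
K,29,65,35,75,25,5,50,40,5
P,13.3,94,6,85,15,10,55,20,15
N,21.6,95,5,80,20,10,55,20,15")

reserves.long <- reserves %>%
  # Wide to long: every column except period and amount becomes a (lett, prop) pair
  pivot_longer(cols = -c(period, amount),
               names_to = "lett", values_to = "prop") %>%
  # Scale each percentage to its share of "amount", then split the old column name
  mutate(rawvalue = prop * amount / 100,
         lett1 = substr(lett, 1, 1),
         num = substr(lett, 2, 2))

print(head(reserves.long))
```

For example, the J/a1 row gets rawvalue = 30 * 18.1 / 100 = 5.43, matching the first row of the table above.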

Then, to plot your tidied data, you want the letters across the x axis and the rawvalue we just calculated (amount * proportion) on the y axis. We stack the geom_col up from 1 to 2 or from 1 to 4 (the reverse = TRUE argument overrides the default, which would put 2 or 4 at the bottom of the stack). alpha and fill let us distinguish between groups in the same bar and between bars.

Then geom_text labels each stacked segment with its name, a newline, and the original percentage, centered on the segment. The alpha scale reverses the default behavior again, making 1 the darkest and 2 or 4 the lightest in each bar. Then you facet across, making one group of bars for each period.

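  ggplot(reserves.tidied, 
         aes(x = lett1, y = rawvalue, alpha = num, fill = lett1)) +
    # Stack segments from 1 upward; black outlines separate the segments
    geom_col(position = position_stack(reverse = TRUE), colour = "black") +
    # Label each segment at its vertical center with name and original percentage
    geom_text(position = position_stack(reverse = TRUE, vjust = .5), 
              aes(label = paste0(lett, ":\n", prop, "%")), alpha = 1) +
    # Segment 1 is the darkest, the last segment the lightest
    scale_alpha_discrete(range = c(1, .1)) +
    # One panel of bars per period
    facet_grid(~period) +
    # Hide both legends ("none" replaces the deprecated FALSE)
    guides(fill = "none", alpha = "none") 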


Rearranging it so that the "h" bars are different from the "a" and "b" bars is a bit more complex, and you'd have to think about how you want it presented, but it's totally doable.
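One possible direction, as a rough sketch rather than the answer's own method: keep the "a" and "b" segments scaled to amount, but draw the "h" segments against a fixed height so that the "h" bar reads as a stacked-percent bar in every panel. The choice of the largest amount as that fixed height, and the names fixed_height, yvalue and reserves.h, are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

reserves <- read.csv(text = "period,amount,a1,a2,b1,b2,h1,h2,h3,h4
J,18.1,30,60,40,60,15,50,30,5
K,29,65,35,75,25,5,50,40,5
P,13.3,94,6,85,15,10,55,20,15
N,21.6,95,5,80,20,10,55,20,15", stringsAsFactors = FALSE)

# Keep the period order from the data rather than alphabetical order
reserves$period <- factor(reserves$period, levels = reserves$period)

# Assumed common height for every "h" bar: the largest amount
fixed_height <- max(reserves$amount)

reserves.h <- reserves %>%
  gather(key = lett, value = prop, -period, -amount) %>%
  mutate(lett1 = substr(lett, 1, 1),
         num = substr(lett, 2, 2),
         # a and b segments scale with amount; h segments fill the fixed height
         yvalue = ifelse(lett1 == "h",
                         prop * fixed_height / 100,
                         prop * amount / 100))

p <- ggplot(reserves.h,
            aes(x = lett1, y = yvalue, alpha = num, fill = lett1)) +
  geom_col(position = position_stack(reverse = TRUE), colour = "black") +
  geom_text(position = position_stack(reverse = TRUE, vjust = .5),
            aes(label = paste0(lett, ":\n", prop, "%")), alpha = 1) +
  scale_alpha_discrete(range = c(1, .1)) +
  facet_grid(~period) +
  guides(fill = "none", alpha = "none")

# print(p) draws the plot
```

With this scaling the "h" bar has the same total height in every panel, so its y-axis values no longer represent amounts; whether that trade-off is acceptable depends on how the figure is meant to be read.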
