ggplot stacked bar graph with bars relating to two different variables with percentages

I would like to create a stacked bar graph with ggplot where the height of each bar depends on one variable (voter turnout in %) and the segments within each bar add up to 100% of another variable (vote share in %). For example, in 1990 voter turnout was 96.7%, so the bar should be 96.7 high and filled with the individual vote shares of each party, which together add up to 100% of that 96.7%. My data covers 3 parties and 3 years.

Here is my data:

party <- c("a", "b", "c", "a", "b", "c", "a", "b", "c")
year <- c(1990, 1990, 1990, 1991, 1991, 1991, 1992, 1992, 1992)
voteshare <- c(0, 33.5, 66.5, 40.5, 39.0, 20.5, 33.6, 33.4, 33)
turnout <- c(96.7, 96.7, 96.7, 85.05, 85.05, 85.05, 76.41, 76.41, 76.41)
df <- data.frame(party, year, voteshare, turnout)

In addition, I would like to print each party's vote share and the total turnout inside the graph.

My approach so far:

ggplot(df, aes(x=year, y=interaction(turnout, voteshare), fill=party)) + 
    geom_bar(stat="identity", position=position_stack()) +
    geom_text(aes(label=voteshare), vjust=0.5)

It's a mess.

Thanks a ton in advance!

I used a dplyr pipeline to:

  • create a column for the adjusted vote, which is the product of each party's vote share and the total turnout, divided by 100.
  • get rid of the zero rows so that no zeros appear in the final output.
  • calculate the y value where each label should be displayed by taking the cumsum() of the adjusted vote, grouped by year. I had to use rev() because position_stack() by default puts the party that comes first alphabetically at the top of the stack.
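As a quick sanity check on the first step, here is a minimal base R sketch. It rebuilds the question's data under a hypothetical name, `votes`, so it doesn't touch `df`, and confirms that the adjusted votes within each year add up to that year's turnout, which is what makes the bar height equal the turnout.

```r
# Rebuild the data from the question under a separate (hypothetical) name.
party     <- rep(c("a", "b", "c"), times = 3)
year      <- rep(c(1990, 1991, 1992), each = 3)
voteshare <- c(0, 33.5, 66.5, 40.5, 39.0, 20.5, 33.6, 33.4, 33)
turnout   <- rep(c(96.7, 85.05, 76.41), each = 3)
votes     <- data.frame(party, year, voteshare, turnout)

# Adjusted vote: each party's share expressed as a portion of the turnout.
votes$adj_vote <- votes$turnout * votes$voteshare / 100

# Sum of the segments per year; this should match the turnout for that year.
seg_total <- tapply(votes$adj_vote, votes$year, sum)
print(seg_total)
```

Because the vote shares sum to 100 in every year, each per-year total should come out equal to the turnout (up to floating point rounding).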

Code

library(dplyr)
library(ggplot2)

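df <- df %>%
  # scale each party's share by turnout so a bar's segments sum to the turnout
  mutate(adj_vote = turnout * voteshare / 100) %>%
  # drop zero-vote rows so that no "0" label is drawn
  filter(adj_vote > 0) %>%
  group_by(year) %>%
  # reverse within each year to match the stacking order (first party on top)
  mutate(cum_vote = cumsum(rev(adj_vote)),
         vote_label = rev(voteshare))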


ggplot(df, aes(x=year, y=adj_vote, fill=party)) + 
  geom_bar(stat="identity", position=position_stack()) +
  geom_text(aes(label=vote_label, y = cum_vote), vjust=0.5)

Output

[Image: ggplot2 output]
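A possible alternative, offered as a sketch rather than a replacement for the code above: instead of computing label positions by hand, `position_stack(vjust = 0.5)` can be passed to `geom_text()` so that each label is stacked the same way as the bars and centred in its segment. The sketch also adds the total turnout above each bar, which the question asked for. The names `votes`, `totals` and `p` are my own, and treating `year` as a factor on the x axis is an assumption about the desired look.

```r
library(ggplot2)

# Data from the question, under a separate (hypothetical) name.
party     <- rep(c("a", "b", "c"), times = 3)
year      <- rep(c(1990, 1991, 1992), each = 3)
voteshare <- c(0, 33.5, 66.5, 40.5, 39.0, 20.5, 33.6, 33.4, 33)
turnout   <- rep(c(96.7, 85.05, 76.41), each = 3)
votes     <- data.frame(party, year, voteshare, turnout)

# Same adjustment as above, and drop zero rows so no "0" label is drawn.
votes$adj_vote <- votes$turnout * votes$voteshare / 100
votes <- votes[votes$adj_vote > 0, ]

# One row per year, used to label the total turnout on top of each bar.
totals <- unique(votes[, c("year", "turnout")])

p <- ggplot(votes, aes(x = factor(year), y = adj_vote, fill = party)) +
  geom_col() +
  # stack the labels like the bars and centre each one in its segment
  geom_text(aes(label = voteshare), position = position_stack(vjust = 0.5)) +
  # total turnout just above each bar; this layer uses its own data and mapping
  geom_text(data = totals,
            aes(x = factor(year), y = turnout, label = turnout),
            inherit.aes = FALSE, vjust = -0.5) +
  labs(x = "year", y = "turnout (%)")

print(p)
```

With this approach there is no need for `rev()` or `cumsum()`, since the text layer and the bar layer share the same stacking order.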
