
How to make a stacked bar plot in R with the data from a dataframe?

I have a table that looks as follows:

family max mean
OG0000000 1336 348.23423423423424
OG0000001 152 66.31531531531532
OG0000002 104 33.85585585585586
OG0000003 133 32.990990990990994
OG0000004 118 31.135135135135137
OG0000005 79 30.83783783783784
OG0000006 123 30.153153153153152
OG0000007 131 29.81081081081081
OG0000008 129 29.684684684684683
OG0000009 93 29.405405405405407
OG0000011 169 28.35135135135135
OG0000012 73 27.56756756756757
OG0000013 113 27.504504504504503
OG0000014 87 24.72972972972973
OG0000015 106 24.675675675675677
OG0000017 131 23.306306306306308
OG0000018 80 22.81081081081081
OG0000020 234 21.324324324324323
OG0000024 89 20.89189189189189

I want to make a bar plot where each bar is labelled with the value in column one ('family') and stacks the data in columns two and three. That is, the data from column two ('max') make up the primary bar, and the data from column three ('mean') are overlaid on it.

I am trying to do this in ggplot.

I have tried the following approach (as taken from this post):

df_long <- df2 %>% gather(family,counts, 2:3) # here df2 is the above table as a data frame
ggplot(df_long, aes(x = family, y = counts, fill = family)) + geom_col(position = position_stack())

This gives the output below, which is really not what I want.

[Image: plot generated by this code]

Not sure if the code in the linked post is outdated or if I am doing something wrong.

I see some webpages, here (for example) and there, showing syntax like barplot(Values, main = "total revenue", names.arg = months, xlab = "month", ylab = "revenue", col = colors) to make bar plots, but there the Values object needs to be a vector, and I am not sure how to pass two sets of data to it.

Searching around on Google, stacked bar plots seem a bit hard to track down. Unfortunately I am not very familiar with the grammar of graphics. Does anyone have any pointers?

If you want to stack the max and mean together for each family, then you can do something like this:

library(tidyverse)

df2 %>%
  pivot_longer(-family) %>%
  ggplot(aes(x = family, y = value, fill = name)) +
  geom_col(position = position_stack()) +
  theme(axis.text.x = element_text(angle = 90))

Output

[Image: resulting plot]
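
Note that stacking adds the two values, so each bar's total height is max + mean. If the intent is instead to overlay the mean on top of the max bar, as the question describes, one possible sketch is to draw two geom_col layers straight from the wide data. The colours, bar widths, and the "counts" axis label below are illustrative choices rather than anything from the original post, and only the first three rows of the table are typed in by hand:

```r
library(ggplot2)

# First three rows of the table from the question, typed in by hand.
df2 <- data.frame(
  family = c("OG0000000", "OG0000001", "OG0000002"),
  max    = c(1336, 152, 104),
  mean   = c(348.23423423423424, 66.31531531531532, 33.85585585585586)
)

# Two layers share the same x positions: the wide grey bar is max,
# and the narrower bar drawn on top of it is mean.
p <- ggplot(df2, aes(x = family)) +
  geom_col(aes(y = max, fill = "max"), width = 0.8) +
  geom_col(aes(y = mean, fill = "mean"), width = 0.5) +
  scale_fill_manual(name = NULL,
                    values = c(max = "grey70", mean = "steelblue")) +
  labs(y = "counts") +
  theme(axis.text.x = element_text(angle = 90))

print(p)
```

Because both layers start at zero, each bar's height here is max itself rather than max + mean.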

Another option (rather than mixing stats) would be to use facet_wrap, so that you have mean in one panel and max in another:

df2 %>%
  pivot_longer(-family) %>%
  ggplot(aes(x = family, y = value)) +
  geom_col(position = position_stack()) +
  scale_y_continuous(breaks = seq(0, 1400, 200),
                     limits = c(0, 1400)) +
  facet_wrap( ~ name, scales = "free_y") +
  theme(axis.text.x = element_text(angle = 90))

[Image: resulting plot]
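
As for the base barplot() syntax mentioned in the question: besides a vector, barplot() also accepts a matrix, and with the default beside = FALSE it stacks the rows of that matrix within each bar. A minimal sketch, again assuming only the first three rows of the table and with illustrative colours and axis label:

```r
# First three rows of the table from the question, typed in by hand.
df2 <- data.frame(
  family = c("OG0000000", "OG0000001", "OG0000002"),
  max    = c(1336, 152, 104),
  mean   = c(348.23423423423424, 66.31531531531532, 33.85585585585586)
)

# Build a 2 x n matrix: one row per statistic, one column per family.
heights <- t(as.matrix(df2[, c("max", "mean")]))
colnames(heights) <- df2$family

# The matrix rows are stacked within each bar (beside = FALSE is the default).
barplot(heights,
        col = c("grey70", "steelblue"),
        legend.text = rownames(heights),
        las = 2,          # rotate the family labels
        ylab = "counts")
```

Passing beside = TRUE instead would draw max and mean as side-by-side bars for each family.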

