
Stacked bar plot based on 4 variables with ggplot2

I have a data frame like this:

nthreads ab_1 ab_2 ab_3 ab_4 ...
1        0    0    0    0    ...
2        1    0    12   1    ...
4        2    1    22   1    ...
8        10   2    103  8    ...

Each ab_X column counts a different cause that triggers an abort in my code. I want to summarize all abort causes in a bar plot of nthreads vs. aborts, with the different ab_X values stacked in each bar.

I can do

ggplot(data, aes(x=factor(nthreads), y=ab_1+ab_2+ab_3+ab_4)) +
  geom_bar(stat="identity")

But it only gives the total number of aborts. I know there is a fill aesthetic, but I cannot make it work with continuous variables.

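You have to melt the data frame first:

library(data.table)
library(ggplot2)
# data.table's melt() works on data.tables, so convert the data frame first
dt_melt <- melt(as.data.table(data), id.vars = 'nthreads')
ggplot(dt_melt, aes(x = nthreads, y = value, fill = variable)) +
    geom_bar(stat = 'identity')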

It gives the total number of aborts because you are adding them together :)
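
To see what melting does, here is a small sketch that rebuilds the four sample rows from the question and reshapes them. It assumes only the four ab_X columns shown (the columns hidden behind "..." are left out) and converts the data frame with as.data.table before melting.

```r
library(data.table)

# Sample rows copied from the question; columns elided by "..." are omitted.
data <- data.frame(
  nthreads = c(1, 2, 4, 8),
  ab_1 = c(0, 1, 2, 10),
  ab_2 = c(0, 0, 1, 2),
  ab_3 = c(0, 12, 22, 103),
  ab_4 = c(0, 1, 1, 8)
)

# Wide to long: one row per (nthreads, abort cause) pair.
dt_melt <- melt(as.data.table(data), id.vars = "nthreads")

dim(dt_melt)              # 16 rows, 3 columns: nthreads, variable, value
levels(dt_melt$variable)  # "ab_1" "ab_2" "ab_3" "ab_4"
```

Because the abort cause is now a single discrete column (variable), it can be mapped to fill, which is what was not possible with the separate numeric columns.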

You need to reshape your data from wide to long format first, i.e. create one column for the abort causes and a second one for their values. You can use tidyr::gather for that. I also find geom_col more convenient than geom_bar:

library(tidyr)
library(ggplot2)
data %>% 
  gather(abort, value, -nthreads) %>% 
  ggplot(aes(factor(nthreads), value)) + 
    geom_col(aes(fill = abort)) + 
    labs(x = "nthreads", y = "count")

Note that the range of values makes some of the bars rather hard to see, so you might want to think about scales and maybe even facets.
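
As a rough sketch of the facet idea, the version below gives each abort cause its own panel with an independent y axis, so the small counts are not dwarfed by ab_3. It rebuilds the sample rows from the question (only the four columns shown) and uses tidyr::pivot_longer, the newer replacement for gather; this assumes tidyr 1.0.0 or later is installed. The object names long and p are just for illustration.

```r
library(tidyr)
library(ggplot2)

# Sample rows copied from the question; columns elided by "..." are omitted.
data <- data.frame(
  nthreads = c(1, 2, 4, 8),
  ab_1 = c(0, 1, 2, 10),
  ab_2 = c(0, 0, 1, 2),
  ab_3 = c(0, 12, 22, 103),
  ab_4 = c(0, 1, 1, 8)
)

# Wide to long: every column except nthreads becomes an (abort, value) pair.
long <- pivot_longer(data, cols = -nthreads,
                     names_to = "abort", values_to = "value")

# One panel per abort cause, each with its own y scale.
p <- ggplot(long, aes(factor(nthreads), value, fill = abort)) +
  geom_col() +
  facet_wrap(~ abort, scales = "free_y") +
  labs(x = "nthreads", y = "count")
p
```

The trade-off is that the bars are no longer stacked, so the total per nthreads is harder to read; dropping the facet_wrap line restores the stacked plot.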

