
Cumulative count of discrete variable in ggplot2

This is related to "Plotting cumulative counts in ggplot2", but that question dealt with a continuous variable rather than a discrete one.

Here, I have a bar chart:

library(ggplot2)

set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))
ggplot(dat) + geom_bar(aes(x, ..count..))


I'm trying to plot a cumulative count with

ggplot(dat) + geom_bar(aes(x, cumsum(..count..)))


There are gaps where there are 'missing values' (i.e. where x is 5, 6, 8 or 9).

Is there a quick and easy way to get a bar chart with the gaps filled with bars, i.e. 11 bars in total? I could manually create a data frame with the cumulative counts and plot it as usual, but I'm curious whether there's a more elegant way.

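You can convert the variable to a factor when plotting. This makes the x axis discrete, so the bars sit next to each other without gaps, but only the values that actually occur in the data get a bar.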

ggplot(dat) + geom_bar(aes(factor(x), cumsum(..count..)))

I would not call this an "easy" approach, but it is the only one I could come up with to solve your question:

  1. Pre-summarise your dataset using e.g. dplyr::count

  2. Fill up your dataset with the missing categories using e.g. tidyr::complete (to this end I first convert x to a factor whose levels cover every integer in the range of x).

  3. Plot via geom_col

library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))
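dat <- dat %>% 
  # one row per observed value of x, with its count in column n
  count(x) %>% 
  # factor levels span every integer from min(x) to max(x), observed or not
  mutate(x = factor(x, levels = seq(range(x)[1], range(x)[2], by = 1))) %>% 
  # add a row with n = 0 for each level that has no observations
  tidyr::complete(x, fill = list(n = 0))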

ggplot(dat) + geom_col(aes(x, cumsum(n)))
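For comparison, the same three steps can be sketched without dplyr or tidyr. This is only an illustration, not part of the answer above: it assumes x holds integers, and the names all_values, counts and cum_dat are made up for the example.

```r
set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))

# Every integer between the smallest and largest observed x (assumes integer x)
all_values <- seq(min(dat$x), max(dat$x), by = 1)

# table() on a factor with explicit levels keeps a zero count for unobserved values
counts <- table(factor(dat$x, levels = all_values))

# One row per integer value, with the running total of the counts
cum_dat <- data.frame(
  x = factor(all_values, levels = all_values),
  cum_n = cumsum(as.vector(counts))
)

# Draw the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(ggplot(cum_dat) + geom_col(aes(x, cum_n)))
}
```

Computing the running total before plotting, rather than inside aes(), means the values can be checked with print(cum_dat) before anything is drawn.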

Using stat_bin instead of geom_bar may also help, since empty bins are kept with a count of zero:

ggplot(dat) + stat_bin(aes(x, cumsum(..count..)))
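By default stat_bin uses 30 bins, which need not line up with the integer values of x. One possible refinement, offered as a sketch rather than as the answerer's code, is to set binwidth = 1 so that each bin is one unit wide. The example also uses after_stat(count), the newer spelling of ..count.., which assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))

# binwidth = 1 gives one-unit-wide bins; bins with no observations keep a
# count of 0, so the cumulative sum carries across them instead of leaving gaps
p <- ggplot(dat) +
  stat_bin(aes(x, after_stat(cumsum(count))), binwidth = 1)

print(p)
```

Unlike the factor-based approaches, this keeps x on a continuous scale.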

