
Draw a bar plot with date and two axis - R ggplot2

Please help me find the correct approach to drawing this plot.

I have a dataset with some monuments and the year they were inscribed as municipal heritage

monument year
A 1990
B 1990
C 1993
D 1995
E 1996

All monuments are different and unique, but there are some years in common.

I would like the x axis to show all the years, with a bar plot of the count of monuments inscribed in each year (including years that don't have any monuments inscribed, so the gaps in time are visible).

Also, it would be awesome to have a secondary axis and draw a line with the cumulative sum of the monuments inscribed.

the final result would be something similar to this

Thanks in advance!

The general thing to remember with secondary axes in ggplot2 is that (1) you need to transform the input data yourself and (2) you need to specify the inverse transform in the secondary axis. Here is an example with some dummy data, where we simply use a scaling factor of 10.

library(ggplot2)

df <- data.frame(
  year = sample(1990:2020, 50, replace = TRUE)
)

scale <- 10 # scaling factor for secondary axis

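ggplot(df, aes(year)) +
  geom_bar(width = 0.5) +
  # (1) divide the cumulative count by the factor so it fits the primary axis
  geom_line(aes(y = after_stat(cumsum(count) / scale)),
            stat = "count", colour = "red") +
  # (2) multiply by the same factor so the secondary axis shows the true values
  scale_y_continuous(
    sec.axis = sec_axis(~ .x * scale, name = "cumulative count")
  )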

Created on 2021-02-03 by the reprex package (v1.0.0)
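
The factor of 10 above is hard-coded. One way to pick it from the data, sketched here under the assumption that you want the cumulative line to top out at the height of the tallest bar, is to divide the final cumulative total by the largest yearly count. The names `years`, `counts`, `scale_factor` and `scaled_cumulative` are illustrative and not part of the answer above.

```r
set.seed(42) # only so the dummy data is reproducible
years <- sample(1990:2020, 50, replace = TRUE)

# Monuments per year (only years that actually occur in the data)
counts <- table(years)

# Assumption: the cumulative line should peak at the tallest bar
scale_factor <- sum(counts) / max(counts)

# The values the red line would take on the primary axis
scaled_cumulative <- cumsum(as.vector(counts)) / scale_factor
```

Whatever factor is chosen, the same value has to be used in both the `after_stat()` expression and in `sec_axis()`, otherwise the secondary axis no longer matches the line.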

It is perhaps also useful to point out that you can get the cumulative count per year with aes(y = after_stat(cumsum(count))).
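
Applied to the data from the question, a minimal sketch without any scaling might look like the following. It assumes the cumulative total is small enough to share the primary axis with the bars. The `scale_x_continuous()` call, which gives every year a tick so the empty years stay visible, is an addition and not something from the answer above.

```r
library(ggplot2)

# Data from the question
d <- data.frame(
  monument = c("A", "B", "C", "D", "E"),
  year = c(1990, 1990, 1993, 1995, 1996)
)

p <- ggplot(d, aes(year)) +
  geom_bar(width = 0.5) +
  # Unscaled cumulative count, drawn on the same axis as the bars
  geom_line(aes(y = after_stat(cumsum(count))),
            stat = "count", colour = "red") +
  # One tick per year, including years with no inscriptions
  scale_x_continuous(breaks = seq(min(d$year), max(d$year)))

p
```

Because `year` is numeric, the x axis is continuous and the years without monuments show up as gaps between the bars.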

So I've always found creating a secondary axis in ggplot2 to be non-intuitive (which is by design - ggplot2 package authors discourage secondary axes because they are often misinterpreted). However, if they must be used, the echarts4r package has a straightforward solution.

library(echarts4r)
library(dplyr)
library(zoo)

d <- data.frame(
  monument = c("A","B","C","D","E"),
  year = c(1990, 1990, 1993, 1995, 1996)) 

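plot_dat <-
  # one row per year between the first and last inscription
  data.frame(year = seq.int(min(d$year), max(d$year))) %>%
  left_join(d %>%
              group_by(year) %>%
              summarize(cnt = n()) %>%
              mutate(cum_cnt = cumsum(cnt)),
            by = "year"
  ) %>%
  # year as text gives a categorical axis; na.locf() carries the last
  # cumulative count forward into years with no inscriptions
  mutate(year = paste(year),
         cum_cnt = na.locf(cum_cnt),
         show = TRUE)
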

plot_dat %>%
  e_charts(year) %>%
  e_bar(cnt) %>%
  e_add("label", show) %>%
  e_line(cum_cnt, y_index = 1) %>% 
  e_hide_grid_lines("y")

The code above produces this result. I made the executive decision to only show y-axis gridlines for the secondary axis since the bars are easily annotated with labels.
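
For readers who want to see what the `plot_dat` step computes without `dplyr` or `zoo`, here is a base R sketch of the same gap-filling idea. One deliberate difference, which is an assumption about what is wanted: years with no inscriptions get a count of 0 rather than `NA`. The names `all_years` and `plot_dat_base` are illustrative.

```r
# Data from the question
d <- data.frame(
  monument = c("A", "B", "C", "D", "E"),
  year = c(1990, 1990, 1993, 1995, 1996)
)

# Every year between the first and last inscription
all_years <- seq(min(d$year), max(d$year))

# Treating year as a factor with every year as a level makes table()
# report a zero for years in which nothing was inscribed
cnt <- as.vector(table(factor(d$year, levels = all_years)))

plot_dat_base <- data.frame(
  year = all_years,
  cnt = cnt,
  cum_cnt = cumsum(cnt)
)
```

For the five monuments in the question this gives counts of 2, 0, 0, 1, 0, 1, 1 and cumulative totals of 2, 2, 2, 3, 3, 4, 5 for the years 1990 to 1996.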

Thanks for posting! I wanted a good excuse to learn echarts4r!
