How to create a bar plot with one categorical variable in different years in ggplot2?

I have a very large data frame in which the first column holds a numeric id for each row. Each of the other columns holds a categorical variable with two possible values (A or B in this example), one column per year. Here's a simplified data frame as an example:

id  var2017  var2018  var2019
1     A        B         A
2     B        A         A
3     B        A         B
4     A        A         A
5     A        B         B

I'd like to create a bar plot that shows the count of each type (A and B) for each year, with the bars grouped by type. I'm new to R, so I tried creating a plot for each year separately, which works fine:

graph <- ggplot(data = example) +
        geom_bar(aes(x = var2017))

The problem is that I don't know how to put them all together. How can I create a plot with the types for each year on the x axis and the count on the y axis? The id doesn't need to appear in the output.

The way to plot multiple columns in ggplot is to first convert the data to long form, which can be done with tidyr::gather. Then you map the column each value came from (now stored in the "year" column) to one aesthetic and the count to another (geom_bar handles the count for you by tallying the rows in each group).

library(tidyverse)
ggplot(data = example %>%
         gather(year, type, -id)) +
  geom_bar(aes(x = year, fill = type), position = "dodge")

(Note: I changed the example so that the years have different counts; otherwise it's harder to see whether it's working.)

example <- read.table(
  header = TRUE,
  stringsAsFactors = FALSE,
  text = "id  var2017  var2018  var2019
           1       A        B         A
           2       B        A         A
           3       B        A         B
           4       B        A         A     # var2017 A changed to B
           5       A        B         B")
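
For reference, here is a minimal sketch of the same reshaping with tidyr::pivot_longer, which supersedes gather in tidyr 1.0.0 and later. The column names year and type match the ones used above; the names_prefix argument that strips the leading "var" from the year labels is my own addition and not part of the original answer.

```r
library(tidyr)
library(ggplot2)

# Same modified example data as above
example <- read.table(
  header = TRUE,
  stringsAsFactors = FALSE,
  text = "id  var2017  var2018  var2019
           1       A        B         A
           2       B        A         A
           3       B        A         B
           4       B        A         A
           5       A        B         B")

# Long form: one row per id/year pair.
# names_prefix drops the leading "var", so year holds "2017", "2018", "2019".
long <- pivot_longer(example, cols = -id,
                     names_to = "year", values_to = "type",
                     names_prefix = "var")

# geom_bar counts the rows in each year/type group
p <- ggplot(long) +
  geom_bar(aes(x = year, fill = type), position = "dodge")
p
```

Printing long before plotting is a quick way to check that the reshape did what you expected: it should have 15 rows (5 ids times 3 years) and the columns id, year and type.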

Similar to the previous answer, but using dplyr::count and geom_col, with clearer pipe syntax:

library(ggplot2)
library(tidyr)
library(dplyr)

example %>% 
  gather(Var, Val, -id) %>% 
  count(Var, Val) %>% 
  ggplot(aes(Var, n)) + 
    geom_col(aes(fill = Val), 
             position = "dodge")
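
If "grouped by type" in the question means having the types on the x axis with one bar per year inside each group, swapping the two aesthetics should be enough. This is only a sketch of that variant, assuming the same modified example data as in the previous answer (rebuilt here with data.frame so the snippet runs on its own); the names counts and p2 are just illustrative.

```r
library(ggplot2)
library(tidyr)
library(dplyr)

# Same values as the modified example in the previous answer
example <- data.frame(
  id = 1:5,
  var2017 = c("A", "B", "B", "B", "A"),
  var2018 = c("B", "A", "A", "A", "B"),
  var2019 = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# One row per year/type combination, with the count in column n
counts <- example %>%
  gather(Var, Val, -id) %>%
  count(Var, Val)

# Type on the x axis, one dodged bar per year
p2 <- ggplot(counts, aes(Val, n)) +
  geom_col(aes(fill = Var), position = "dodge")
p2
```

Keeping the counted table in its own variable also makes it easy to inspect the numbers that end up in the plot.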
