
Using frequency on subgroups of ggplot/facet_wrap()

When plotting small multiples for categorical variables, I used the following code:

ggplot(raw, aes(x = income)) +
  geom_bar(aes(y = ..count../sum(..count..), fill = factor(..x..))) +
  facet_wrap("workclass")

However, within each facet the bars show each count as a fraction of the whole dataset, not of the subset shown in that facet.

What would I need to change in this code so that the proportion is computed only within each facet_wrap subset?

You need to reshape the data first, i.e. compute the proportions within each workclass group before calling ggplot(). Here is a data.table way to do this.

library(data.table)
library(ggplot2)
rawdt <- data.table(raw)
# Count rows per (income, workclass), total them per workclass, then divide
new_data <- rawdt[, .N, by = .(income, workclass)][, classN := sum(N), by = workclass][, y := N / classN]
ggplot(new_data, aes(x = income, y = y)) +
  geom_bar(stat = "identity") +
  facet_wrap(~workclass)
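
The aggregation in the answer above can be checked without any extra packages. The following is a minimal sketch in base R, assuming the built-in mtcars dataset as a stand-in for raw (cyl plays the role of workclass and gear the role of income); the column names N, classN and y simply mirror the data.table code.

```r
# mtcars is a stand-in for `raw` (assumption): cyl ~ workclass, gear ~ income
counts <- as.data.frame(table(gear = mtcars$gear, cyl = mtcars$cyl),
                        responseName = "N")

# Total count within each cyl group, repeated on every row of that group
counts$classN <- ave(counts$N, counts$cyl, FUN = sum)

# Proportion of each gear value within its own cyl group
counts$y <- counts$N / counts$classN

# Within each cyl group the proportions should add up to 1
group_sums <- tapply(counts$y, counts$cyl, sum)
print(counts)
print(group_sums)
```

If the sums per group are all 1, the y column is the per-facet proportion the question asks for, rather than a share of the whole dataset.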

You could also do this with dplyr.

For example, here is your code applied to the mtcars dataset:

ggplot(mtcars,aes(x = gear)) +
  geom_bar(aes(y = ..count../sum(..count..), fill = factor(..x..))) + 
  facet_wrap("cyl")

Reshaping the data as in @amatsuo_net's solution, but with dplyr:

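library(dplyr)
library(ggplot2)

# Rows per cyl (facet totals) joined to rows per (gear, cyl) combination
totals     <- mtcars %>% group_by(cyl) %>% summarise(total = n())
sub_totals <- mtcars %>% group_by(gear, cyl) %>% summarise(sub_total = n(), .groups = "drop")
mtcars2 <- inner_join(totals, sub_totals, by = "cyl") %>%
  mutate(prop = sub_total / total)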

ggplot(data = mtcars2, aes(x = gear,y=prop)) +
  geom_bar(stat = "identity") + 
  facet_wrap(~cyl)
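
As an alternative to pre-aggregating, the normalisation can be done inside the plot call. The sketch below is not from the answers above. It assumes ggplot2 3.3.0 or later (for after_stat()) and relies on the PANEL column that ggplot2 attaches to the computed layer data, which is an internal detail and could change between versions. The expression in after_stat() is evaluated over the whole layer, which is why a plain sum(count) spans every facet; grouping the sum by PANEL with ave() restricts it to one facet at a time.

```r
library(ggplot2)

# Divide each bar's count by the total count of its own facet.
# PANEL is the facet index ggplot2 stores in the computed data (assumption:
# internal column, not a documented part of the public API).
p <- ggplot(mtcars, aes(x = factor(gear), fill = factor(gear))) +
  geom_bar(aes(y = after_stat(count / ave(count, PANEL, FUN = sum)))) +
  facet_wrap(~cyl) +
  labs(x = "gear", y = "proportion within cyl", fill = "gear")

# Inspect the computed bar heights: they should sum to 1 in every facet
bars <- layer_data(p)
panel_sums <- tapply(bars$y, bars$PANEL, sum)
print(panel_sums)
```

Printing p draws the plot. The pre-aggregation approaches above are easier to inspect and do not depend on ggplot2 internals, so they may be the safer choice when the proportions are also needed outside the plot.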
