When plotting small multiples for categorical variables, I used the following code:
ggplot(raw, aes(x = income)) +
  geom_bar(aes(y = ..count../sum(..count..), fill = factor(..x..))) +
  facet_wrap("workclass")
However, in each facet the bars show the count as a proportion of the whole dataset, not of that facet's own subset.
What change would I need to make in this code so that the proportion is computed only within each facet_wrap subset?
You need to reshape the data first, i.e. compute the proportions within each workclass group before calling ggplot(). Here is a data.table way to do this.
library(data.table)
rawdt <- as.data.table(raw)
# Count rows per (income, workclass), then divide by each workclass total
new_data <- rawdt[, .N, by = .(income, workclass)][
  , classN := sum(N), by = workclass][
  , y := N / classN]
ggplot(new_data, aes(x = income, y = y)) +
  geom_bar(stat = "identity") +
  facet_wrap(~workclass)
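As a quick sanity check on the approach above, here is a minimal sketch that runs on the built-in mtcars dataset, used as a stand-in because raw is not available here. gear and cyl play the roles of income and workclass, and the names dt, props and check are made up for illustration.

```r
library(data.table)

# Assumption: mtcars stands in for `raw`; gear ~ income, cyl ~ workclass
dt <- as.data.table(mtcars)

# Count rows per (gear, cyl), then divide by the total of each cyl group
props <- dt[, .N, by = .(gear, cyl)][, prop := N / sum(N), by = cyl]

# The proportions should add up to 1 inside every cyl group
check <- props[, .(total = sum(prop)), by = cyl]
print(check)
```

If every total comes out as 1 (up to floating-point rounding), the proportions are being computed per facet group rather than over the whole dataset.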
You could use dplyr. For example, your code on the mtcars dataset:
ggplot(mtcars, aes(x = gear)) +
  geom_bar(aes(y = ..count../sum(..count..), fill = factor(..x..))) +
  facet_wrap("cyl")
Reformulating the data like @amatsuo_net's solution, but with dplyr:
library(dplyr)
# Join the per-cyl totals to the per-(gear, cyl) counts, then take the ratio
mtcars2 <- inner_join(
  mtcars %>%
    group_by(cyl) %>%
    summarise(total = n()),
  mtcars %>%
    group_by(gear, cyl) %>%
    summarise(sub_total = n()),
  by = "cyl") %>%
  mutate(prop = sub_total / total)
# Plot the precomputed proportions, one panel per cyl
ggplot(data = mtcars2, aes(x = gear, y = prop)) +
  geom_bar(stat = "identity") +
  facet_wrap(~cyl)
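A shorter variant, offered as a sketch rather than a replacement for the code above: count() followed by a grouped mutate() should give the same per-cyl proportions without the join. The names props and check are illustrative only.

```r
library(dplyr)

# Count rows per (cyl, gear); count() stores the result in a column named n
props <- mtcars %>%
  count(cyl, gear) %>%
  group_by(cyl) %>%
  mutate(prop = n / sum(n)) %>%  # share of each gear within its cyl group
  ungroup()

# Sanity check: the proportions should add up to 1 inside every cyl group
check <- props %>%
  group_by(cyl) %>%
  summarise(total = sum(prop))
print(check)
```

The props data frame can be passed to the same ggplot() call as mtcars2, since it has the same gear, cyl and prop columns.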