
Reorder factor in mosaic plot in R with ggmosaic and geom_mosaic()

I have tried to familiarise myself with making mosaic plots in R with the ggmosaic package's geom_mosaic() command.

My problem is that I want the regions to be ordered by the share of seniors in each region, and not by name as now. Any help?

I am not very used to working with factors, but I have tried different things with forcats' fct_reorder() function without any luck.

Here is a sample dataset (not the actual dataset I work with) and the code I have made so far:

# install.packages(c("ggplot2", "ggmosaic"))
library(ggplot2)
library(ggmosaic)
  
# Make data set      
region <- c("Oslo", "Oslo", "Oslo", "Viken", "Viken", "Viken", 
            "Nordland", "Nordland", "Nordland")
age    <- c("young", "adult", "senior", "young", "adult", "senior",
            "young", "adult", "senior")
pop    <- c(145545, 462378, 89087, 299548, 729027, 223809, 52156, 136872, 51317)
df     <- data.frame(region, age, pop)

# Make mosaic plot
ggplot(data = df) +
  geom_mosaic(aes(x = product(age, region), fill = age, weight = pop)) +
  coord_flip() +
  theme_minimal()

UPDATE: Sorry if I was unclear, but what I wanted was this:

[Image: mosaic plot ranked]

Where the regions are ranked/ordered by the share of seniors rather than the default order, like this:

[Image: mosaic plot unranked]

I eventually solved it by calling fct_reorder() in an 'untidy' way, assigning to df$region directly, rather than inside a mutate() call in a pipeline. I have no idea why that made a difference. One more observation: fct_reorder() works fine within a regular ggplot2 geom_...() call, but not (at least the way I tried it) within geom_mosaic() from the ggmosaic package.
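For readers less familiar with factors, here is a minimal sketch of what fct_reorder() does to the level order. The share values are the rounded senior percentages computed from the sample data above; the variable names `senior_share` and `ranked` are only illustrative.

```r
library(forcats)

# One entry per region; shares are percentages rounded from the sample data
region       <- factor(c("Oslo", "Viken", "Nordland"))
senior_share <- c(12.8, 17.9, 21.4)

# factor() sorts levels alphabetically by default
levels(region)   # "Nordland" "Oslo" "Viken"

# fct_reorder() re-sorts the levels by a second variable (ascending)
ranked <- fct_reorder(region, senior_share)
levels(ranked)   # "Oslo" "Viken" "Nordland"
```

Only the level order changes; the values stored in the factor stay the same, and plotting functions that respect factor levels then draw the regions in that order.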

NEW CODE (which is far too verbose for just estimating the share of seniors)

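# install.packages(c("ggplot2", "ggmosaic", "dplyr", "forcats"))
library(ggplot2)
library(ggmosaic)
library(dplyr)    # %>%, group_by(), summarise(), mutate(), left_join()
library(forcats)  # fct_reorder()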

# Make data set      
region <- c("Oslo", "Oslo", "Oslo", "Viken", "Viken", "Viken", 
            "Nordland", "Nordland", "Nordland")
age    <- c("young", "adult", "senior", "young", "adult", "senior",
            "young", "adult", "senior")
pop    <- c(145545, 462378, 89087, 299548, 729027, 223809, 52156, 136872, 51317)
df     <- data.frame(region, age, pop)

df <- df %>% 
  group_by(region, age) %>%
  summarise(pop = sum(pop)) %>% 
  mutate(senior = case_when(age == "senior" ~ pop))

# Get total population and senior share (in percent) of each region
df_tot <- df %>% 
  group_by(region) %>% 
  summarise(poptot = sum(pop),
            senior = median(senior, na.rm = TRUE)) %>% 
  mutate(senior_share = senior / poptot * 100) %>% 
  select(region, senior_share)

# Attach each region's senior share to the data
df <- df %>% 
  left_join(df_tot, by = "region")

# Fix the factors
df$region <- fct_reorder(df$region, df$senior_share)
df$age <- factor(df$age, levels = c("young", "adult", "senior"))

# Make mosaic plot
ggplot(data = df) +
  geom_mosaic(aes(x = product(age, region), fill = age, weight = pop)) +
  coord_flip() +
  theme_minimal()
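A more compact sketch of the data preparation, under some assumptions: the share is computed inside one grouped mutate(), and the grouping is dropped with ungroup() before `region` is changed. One possible explanation for the trouble with mutate() (a guess, since the failing pipeline is not shown) is that the data is still grouped by region after group_by(region, age) %>% summarise(), and as far as I know dplyr does not let mutate() modify a grouping column.

```r
library(dplyr)
library(forcats)

# Same sample data as above
df <- data.frame(
  region = rep(c("Oslo", "Viken", "Nordland"), each = 3),
  age    = rep(c("young", "adult", "senior"), times = 3),
  pop    = c(145545, 462378, 89087, 299548, 729027, 223809,
             52156, 136872, 51317)
)

df <- df %>%
  group_by(region) %>%
  # Share of seniors within each region, in percent
  mutate(senior_share = 100 * sum(pop[age == "senior"]) / sum(pop)) %>%
  ungroup() %>%   # drop the grouping before changing `region`
  mutate(
    region = fct_reorder(region, senior_share),
    age    = factor(age, levels = c("young", "adult", "senior"))
  )

levels(df$region)   # "Oslo" "Viken" "Nordland"
```

The resulting df can be passed to the same ggplot() + geom_mosaic() call as above.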

Use this code to set the order of the factor levels:

df$age <- factor(df$age, levels = c("senior","adult","young"))
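The same levels = approach can be applied to region. The following is a sketch in base R only, assuming the sample data from the question; `tab` and `share` are illustrative names.

```r
# Same sample data as in the question
df <- data.frame(
  region = rep(c("Oslo", "Viken", "Nordland"), each = 3),
  age    = rep(c("young", "adult", "senior"), times = 3),
  pop    = c(145545, 462378, 89087, 299548, 729027, 223809,
             52156, 136872, 51317)
)

# Population cross-tabulated by region (rows) and age (columns)
tab <- xtabs(pop ~ region + age, data = df)

# Senior share per region: senior column divided by the row totals
share <- tab[, "senior"] / rowSums(tab)

# List the levels from lowest to highest senior share
df$region <- factor(df$region, levels = names(share)[order(share)])
levels(df$region)   # "Oslo" "Viken" "Nordland"
```

Using order(share, decreasing = TRUE) instead would rank the regions from highest to lowest share.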


 