
How to order the bar based on bar values in geom_bar?

This is where I get my dataset:

board_game_original <- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project1_bgdataviz/board_game_raw.csv")

# Tidy up the mechanic and category columns with the cSplit function
library(splitstackshape)
board_game <- board_game_original  # work on a copy of the raw data
mechanic <- board_game$mechanic
board_game_tidy <- cSplit(board_game, splitCols = c("mechanic", "category"), sep = ",", direction = "long")

I am trying to make the graph more organized by ordering the bars by their values on the y-axis. I tried using the reorder function, but it still does not work. Does anyone have any suggestions? I am quite new to R and hope to learn more!

library(ggplot2)
library(dplyr)  # provides %>%, filter() and select()
average_complexity <- board_game_tidy %>% 
            filter(yearpublished >= 1950, users_rated >= 25, average_complexity>0 ) %>%
            select(average_complexity)
category_complexity_graph <- ggplot(data=board_game_tidy, aes(x = reorder(category, -average_complexity), y = average_complexity, na.rm = TRUE)) + 
        geom_bar(stat = "identity", na.rm = TRUE, color="white",fill="sky blue") + 
        ylim(0,5) +
        theme_bw() +
        ggtitle("Which category of board games has the highest level of average complexity") +
        xlab("category of board games") +
        ylab("average complexity of the board game") +
        theme(axis.text.x = element_text(size=5, angle = 45)) +
        theme(plot.title = element_text(hjust = 0.5)) 
category_complexity_graph

Here's the graph I plotted: [image]. "Category" is a categorical variable and "average complexity" is a continuous variable.

I was trying to answer the question "which category has the highest average complexity?", but this graph looks messy, so any suggestions for cleaning it up would be appreciated as well! Thank you all.

Maybe this is what you are looking for. The issue is not the reordering; it is how the data are prepared. Reordering by the average does not give you a nice plot because you have multiple observations per category and, more importantly, a different number of observations per category. When you draw a bar plot from this dataset with stat = "identity", all of these observations get stacked, i.e. your plot shows the sum of the average complexities per category, while reorder() by default orders the categories by their mean. Hence, to get the result you want, you first have to summarise your dataset by category. After doing so, your reordering code works and gives you a nice plot.
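To make the stacking problem concrete, here is a minimal sketch on made-up data (the `toy` data frame and its values are hypothetical, not taken from the board game dataset). It uses only base R and compares the bar heights that stacking produces with the per-category means that reorder() sorts by:

```r
# Hypothetical toy data: category "A" has three rows, category "B" has one
toy <- data.frame(
  category = c("A", "A", "A", "B"),
  average_complexity = c(1, 1, 1, 2.5)
)

# Stacking every row gives a bar whose height is the per-category sum
bar_heights <- tapply(toy$average_complexity, toy$category, sum)

# The quantity the question is about: the per-category mean
category_means <- tapply(toy$average_complexity, toy$category, mean)

# reorder() uses FUN = mean by default, so the levels follow the means
level_order <- levels(reorder(toy$category, -toy$average_complexity))

bar_heights     # A = 3.0, B = 2.5
category_means  # A = 1.0, B = 2.5
level_order     # "B" "A"
```

"B" is placed first because it has the higher mean, yet the stacked bar for "A" is taller, so the bars do not appear sorted. Summarising to one row per category makes the bar height and the ordering key the same number.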

I would also suggest flipping the axes, which makes the labels easier to read:

board_game_original<- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project1_bgdataviz/board_game_raw.csv")

#tidy up the column of mechanic and category with cSplit function
library(splitstackshape)
board_game <- board_game_original
mechanic <- board_game$mechanic
board_game_tidy <- cSplit(board_game,splitCols=c("mechanic","category"), sep = ",", direction = "long")

library(ggplot2)
library(dplyr)
# Summarise your dataset
board_game_tidy1 <- board_game_tidy %>% 
  as_tibble() %>% 
  filter(yearpublished >= 1950, users_rated >= 25, average_complexity > 0, !is.na(category)) %>%
  group_by(category) %>% 
  summarise(n = n(), average_complexity = mean(average_complexity, na.rm = TRUE))

# na.rm is not an aesthetic, so it is passed to geom_bar() only
ggplot(data = board_game_tidy1, aes(x = reorder(category, average_complexity), y = average_complexity)) + 
  geom_bar(stat = "identity", na.rm = TRUE, color = "white", fill = "sky blue") + 
  ylim(0,5) +
  theme_bw() +
  ggtitle("Which category of board games has the highest level of average complexity") +
  xlab("category of board games") +
  ylab("average complexity of the board game") +
  #theme(axis.text.x = element_text(size=5, angle = 45)) +
  theme(plot.title = element_text(hjust = 0.5)) +
  coord_flip()
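As an alternative sketch, assuming ggplot2 3.3.0 or later (where the summary stat accepts a `fun` argument), the mean can also be computed inside the plot instead of in a separate summarise step. The `toy` data frame below is hypothetical; with the real data you would still apply the filter() conditions first.

```r
library(ggplot2)

# Hypothetical toy data with several rows per category
toy <- data.frame(
  category = c("A", "A", "A", "B", "C", "C"),
  average_complexity = c(1, 1, 1, 2.5, 2, 1.5)
)

# stat = "summary" with fun = "mean" draws one bar per category at its mean;
# reorder(..., FUN = mean) sorts the categories by that same mean
p <- ggplot(toy, aes(x = reorder(category, average_complexity, FUN = mean),
                     y = average_complexity)) +
  geom_bar(stat = "summary", fun = "mean", fill = "skyblue") +
  xlab("category") +
  ylab("mean complexity") +
  coord_flip()
p
```

Because the bar height and the ordering key are both the per-category mean, the bars come out sorted. Summarising beforehand, as in the code above, has the advantage that the table of means is also available for inspection.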
 

在此处输入图片说明
