Y-axis values are larger than the actual values in geom_bar

Question

I am trying to create a bar plot from the following dataset.

library(tidyverse)
library(janitor) 
library(lubridate)

product <- read_csv(
  "https://s3.us-west-2.amazonaws.com/public.gamelab.fun/dataset/Al-Bundy_raw-data.csv"
)

product <- product %>% 
  janitor::clean_names() %>% # this function cleans the names of the variables
  dplyr::rename_all(toupper)

When I run the following code, I get the bar plot:

product %>% 
  count(SIZE_US, GENDER) %>% 
  pivot_wider(
    names_from = "GENDER",
    values_from = "n"
  ) %>% 
  rename_all(toupper) %>%
  replace(is.na(.),0) %>% 
  mutate(
    TOTAL_SALES = FEMALE + MALE
  ) %>% 
  pivot_longer(
    cols = c("FEMALE", "MALE"),
    names_to = "GENDER",
    values_to = "GENDERSALES"
  )%>% 
  ggplot(aes(x=reorder(SIZE_US,as.numeric(SIZE_US)),y= TOTAL_SALES, fill = GENDER))+
  geom_bar(stat = "identity")+
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE")+
  geom_text(
    aes(label = GENDERSALES), 
    position = position_stack(vjust = 0.5), 
    color = "white", 
    size = 2
  )

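But the problem is that the Y-axis values are larger than the actual values in the data. For example, the bar plot shows Y-axis values above 4000, while the highest actual y value in the data is 2346. I added the following as the last line of the code: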

scale_y_continuous(limits=c(0,2500),oob = rescale_none)

But then half of the bars in the plot fall outside the plotting area.

Answer 1

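A stacked bar chart shows how a category is divided into parts and how each part relates to the total. The total height of a bar is the sum of its parts.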

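In your case you have two categories (FEMALE and MALE), and the maximum value is 2346. By definition, a stacked bar shows all categories in a single bar. Because y is mapped to TOTAL_SALES, and after pivot_longer() every size has one FEMALE row and one MALE row that both carry the same TOTAL_SALES, each bar stacks that total twice. That is why the values on the Y axis exceed 4000.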
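To see where the extra height comes from, here is a minimal sketch on made-up toy counts. The toy tibble, its numbers, and the column names stacked_if_total, stacked_if_gender and actual_total are hypothetical and are not taken from the Al-Bundy dataset. It reproduces the same reshaping steps as the question and compares what a stacked bar would sum to for each choice of y.

```r
library(dplyr)
library(tidyr)

# Hypothetical counts standing in for count(SIZE_US, GENDER)
toy <- tibble(
  SIZE_US = c("8", "8", "9", "9"),
  GENDER  = c("FEMALE", "MALE", "FEMALE", "MALE"),
  n       = c(10, 30, 25, 5)
)

# Same reshaping as in the question: wide, add the total, back to long
plot_data_toy <- toy %>%
  pivot_wider(names_from = "GENDER", values_from = "n") %>%
  mutate(TOTAL_SALES = FEMALE + MALE) %>%
  pivot_longer(
    cols = c("FEMALE", "MALE"),
    names_to = "GENDER",
    values_to = "GENDERSALES"
  )

# A stacked bar adds up the y values of all rows that share an x value
stack_heights <- plot_data_toy %>%
  group_by(SIZE_US) %>%
  summarise(
    stacked_if_total  = sum(TOTAL_SALES),   # bar height when y = TOTAL_SALES
    stacked_if_gender = sum(GENDERSALES),   # bar height when y = GENDERSALES
    actual_total      = first(TOTAL_SALES)  # the value the reader expects
  )

print(stack_heights)
```

In this toy data, stacking TOTAL_SALES gives a bar twice as tall as the real total, while stacking GENDERSALES gives exactly the total.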

You can solve this in two ways. The first is to remove the Y-axis text and show only the relationship:

product %>% 
  count(SIZE_US, GENDER) %>% 
  pivot_wider(
    names_from = "GENDER",
    values_from = "n"
  ) %>% 
  rename_all(toupper) %>%
  replace(is.na(.),0) %>% 
  mutate(
    TOTAL_SALES = FEMALE + MALE
  ) %>% 
  pivot_longer(
    cols = c("FEMALE", "MALE"),
    names_to = "GENDER",
    values_to = "GENDERSALES"
  ) -> plot_data 

plot_data %>% 
  ggplot(aes(x=reorder(SIZE_US,as.numeric(SIZE_US)),y= as.numeric(TOTAL_SALES), fill = GENDER))+
  geom_bar(stat = "identity") +
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE") +
  geom_text(
    aes(label = GENDERSALES), 
    position = position_stack(vjust = 0.5), 
    color = "white", 
    size = 2
  ) +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

The second is to use a grouped bar chart instead of a stacked one:

plot_data %>% 
  ggplot(aes(x=reorder(SIZE_US,as.numeric(SIZE_US)),y= as.numeric(TOTAL_SALES), fill = GENDER))+
  geom_col(position = "dodge2") +
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE") +
  geom_text(
    aes(label = GENDERSALES), 
    position = position_dodge2(width = .9), 
    color = "white", 
    size = 2,
    vjust = -0.5 
  )
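A further option, not part of the answer above and offered only as a sketch, is to keep the stacked chart but map y to GENDERSALES instead of TOTAL_SALES. Each segment then has the height of its own label, and the full bar adds up to the total for that size. The toy_long data below is hypothetical; with the real data you would presumably pass plot_data instead.

```r
library(ggplot2)
library(tibble)

# Hypothetical long-format data with the same columns as plot_data
toy_long <- tribble(
  ~SIZE_US, ~GENDER,  ~GENDERSALES,
  "8",      "FEMALE", 10,
  "8",      "MALE",   30,
  "9",      "FEMALE", 25,
  "9",      "MALE",   5
)

p <- ggplot(
  toy_long,
  aes(
    x = reorder(SIZE_US, as.numeric(SIZE_US)),
    y = GENDERSALES,            # per-gender count, not the repeated total
    fill = GENDER
  )
) +
  geom_col() +                  # geom_col() stacks by default
  geom_text(
    aes(label = GENDERSALES),
    position = position_stack(vjust = 0.5),
    color = "white",
    size = 2
  ) +
  labs(
    x = "SHOE SIZE",
    y = "TOTAL SALES",
    title = "SALES OF DIFFERENT SIZES OF SHOE"
  )

# Tallest stacked bar as ggplot2 computes it for the first layer
tallest_bar <- max(layer_data(p, 1)$ymax)
print(tallest_bar)
```

With this mapping the Y axis needs no manual limits, because the tallest bar is the largest per-size total rather than twice that value.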

Solution 1
0 2021-07-15 13:55:28
