Y axis values are greater than actual values in geom_bar
I am trying to create a bar plot from the following dataset.
library(tidyverse)
library(janitor)
library(lubridate)
product <- read_csv(
  "https://s3.us-west-2.amazonaws.com/public.gamelab.fun/dataset/Al-Bundy_raw-data.csv"
)
product <- product %>%
  janitor::clean_names() %>% # this function cleans the names of the variables
  dplyr::rename_all(toupper)
When I run the following code, I get the bar plot:
product %>%
  count(SIZE_US, GENDER) %>%
  pivot_wider(
    names_from = "GENDER",
    values_from = "n"
  ) %>%
  rename_all(toupper) %>%
  replace(is.na(.), 0) %>%
  mutate(
    TOTAL_SALES = FEMALE + MALE
  ) %>%
  pivot_longer(
    cols = c("FEMALE", "MALE"),
    names_to = "GENDER",
    values_to = "GENDERSALES"
  ) %>%
  ggplot(aes(x = reorder(SIZE_US, as.numeric(SIZE_US)), y = TOTAL_SALES, fill = GENDER)) +
  geom_bar(stat = "identity") +
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE") +
  geom_text(
    aes(label = GENDERSALES),
    position = position_stack(vjust = 0.5),
    color = "white",
    size = 2
  )
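The problem is that the Y-axis values are greater than the actual values in the data. For example, the bar plot shows Y-axis values above 4000, but the highest actual value in the data is 2346. I added the following as the last line of the code:
scale_y_continuous(limits = c(0, 2500), oob = scales::rescale_none)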
But then half of the bars fall outside the plot.
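A stacked bar chart shows how each category is divided and how the parts relate to the total. The height of a bar is the sum of its segments.
In your case there are two categories (male and female), and the largest value is 2346. After pivot_longer, each shoe size has one row per gender, and both rows carry the same TOTAL_SALES. Because y is mapped to TOTAL_SALES, the stacked bar adds that value once per gender, which is why the Y axis goes above 4000.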
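The doubling can be reproduced without the plot. The following is a minimal sketch on a made-up data frame (the counts and the names `toy`, `toy_long`, `bar_heights`, `ACTUAL_TOTAL` and `STACKED_HEIGHT` are illustrative assumptions, not values from the real dataset). It applies the same reshaping steps and then sums TOTAL_SALES per shoe size, which is what stacking does when y is mapped to TOTAL_SALES.

```r
library(dplyr)
library(tidyr)

# Made-up counts for two shoe sizes; NOT taken from the real dataset.
toy <- tibble(
  SIZE_US = c("8", "8", "9", "9"),
  GENDER  = c("FEMALE", "MALE", "FEMALE", "MALE"),
  n       = c(30, 70, 45, 60)
)

# Same reshaping as in the question: wide, add the total, back to long.
toy_long <- toy %>%
  pivot_wider(names_from = "GENDER", values_from = "n") %>%
  mutate(TOTAL_SALES = FEMALE + MALE) %>%
  pivot_longer(
    cols = c("FEMALE", "MALE"),
    names_to = "GENDER",
    values_to = "GENDERSALES"
  )

# Each size now has two rows with the same TOTAL_SALES.
# Stacking with y = TOTAL_SALES adds that value once per row.
bar_heights <- toy_long %>%
  group_by(SIZE_US) %>%
  summarise(
    ACTUAL_TOTAL   = first(TOTAL_SALES),
    STACKED_HEIGHT = sum(TOTAL_SALES)
  )

print(bar_heights)
```

In this toy example STACKED_HEIGHT comes out at twice ACTUAL_TOTAL for every size, which matches an axis running past 4000 when the largest real total is 2346.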
You can solve this in two ways. The first is to remove the Y-axis text and show only the relationship:
product %>%
  count(SIZE_US, GENDER) %>%
  pivot_wider(
    names_from = "GENDER",
    values_from = "n"
  ) %>%
  rename_all(toupper) %>%
  replace(is.na(.), 0) %>%
  mutate(
    TOTAL_SALES = FEMALE + MALE
  ) %>%
  pivot_longer(
    cols = c("FEMALE", "MALE"),
    names_to = "GENDER",
    values_to = "GENDERSALES"
  ) -> plot_data
plot_data %>%
  ggplot(aes(x = reorder(SIZE_US, as.numeric(SIZE_US)), y = as.numeric(TOTAL_SALES), fill = GENDER)) +
  geom_bar(stat = "identity") +
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE") +
  geom_text(
    aes(label = GENDERSALES),
    position = position_stack(vjust = 0.5),
    color = "white",
    size = 2
  ) +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())
The second is to use a grouped bar chart instead of a stacked one:
plot_data %>%
  ggplot(aes(x = reorder(SIZE_US, as.numeric(SIZE_US)), y = as.numeric(TOTAL_SALES), fill = GENDER)) +
  geom_col(position = "dodge2") +
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE") +
  geom_text(
    aes(label = GENDERSALES),
    position = position_dodge2(width = .9),
    color = "white",
    size = 2,
    vjust = -0.5
  )
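A further option, not one of the two approaches above, is to keep the stacked chart but map y to GENDERSALES instead of TOTAL_SALES, so that the segments of each bar add up to the real total. The sketch below assumes data shaped like plot_data and uses a made-up stand-in (`toy_plot_data`, `p` and `stacked_top` are hypothetical names, and the counts are invented) so that it runs on its own.

```r
library(dplyr)
library(ggplot2)

# Made-up rows in the same shape as plot_data; NOT taken from the real dataset.
toy_plot_data <- tibble(
  SIZE_US     = c("8", "8", "9", "9"),
  TOTAL_SALES = c(100, 100, 105, 105),
  GENDER      = c("FEMALE", "MALE", "FEMALE", "MALE"),
  GENDERSALES = c(30, 70, 45, 60)
)

p <- toy_plot_data %>%
  ggplot(aes(
    x = reorder(SIZE_US, as.numeric(SIZE_US)),
    y = GENDERSALES,  # per-gender count, so the stacked segments sum to the total
    fill = GENDER
  )) +
  geom_col() +
  labs(x = "SHOE SIZE",
       y = "TOTAL SALES",
       title = "SALES OF DIFFERENT SIZES OF SHOE") +
  geom_text(
    aes(label = GENDERSALES),
    position = position_stack(vjust = 0.5),
    color = "white",
    size = 2
  )

# Top of the tallest stacked bar as computed by ggplot2 for the first layer.
stacked_top <- max(layer_data(p, 1)$ymax)
print(stacked_top)
```

With the real plot_data, the same change would be to replace `y = as.numeric(TOTAL_SALES)` with `y = GENDERSALES` in the stacked version; the tallest bar should then stop at the largest TOTAL_SALES rather than at double that value.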