
How to only select the top N groups based on total count to plot using ggplot in R

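I have a chart showing the different ten-code groups (a factor) and their respective counts for each year. See the image here for reference (ignore the bad axis labels):

[image: chart of Tencode counts by year]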

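The goal is to show only the top N "Tencodes" by the total count you see here, rather than all of them, and then to sort them in descending order by total count within each year, so that only the most impactful groups are shown.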

Here is the code I am trying, but I keep getting an error:

complete_df %>%
  count(Tencode.Description, sort = TRUE) %>%
  head(10) %>%
  na.omit() %>%
  mutate(Tencode.Description = fct_reorder(Tencode.Description, n)) %>%
  ggplot(aes(Year, fill = Tencode.Description)) +
  geom_bar(position=position_dodge()) +
  scale_y_log10(labels = comma) +
  #facet_wrap(~ Year) +
  labs(fill = "Tencode") + theme(
    axis.title.x = element_text(color="black", size=15, face="bold"),
    axis.title.y = element_text(color="black", size=15, face="bold"),
    legend.title = element_text(colour = "black", size = 10, face = "bold"),
    legend.text = element_text(colour = "black", size = 10, face = "plain"),
    axis.text.x = element_text(colour = "black", size = 10, face = "plain"),
    axis.text.y = element_text(colour = "black", size = 10, face = "plain")) +
  labs( x="Year", y = "Count")

Any help would be greatly appreciated. Here is also a summary of my raw data:



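What you need to do is identify the top N groups before plotting, then filter the data passed to ggplot down to those groups. Here is an example that uses some random data and filters to the top 10: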

library(dplyr)
library(ggplot2)
library(scales)

# create sample data, using runif for the count figures
set.seed(100)
sample_code <- c("Bomb Threat", "Burglary - Non-Residence", "Burglary - Residence",
                 "Community Policing Activity", "Corpse / D. O. A.",
                 "Cutting / Stabbing", "Dangerous / Injured Animal",
                 "Disorderly Person", "Drowning", "Fight / Assault",
                 "Fire", "Hold up / Robbery", "Intoxicated Person", "Theft",
                 "Transport Prisoner / Suspect", "Vehicle Accident",
                 "Missing Person", "Person Indecently Exposed")
code_data <- tibble(
  Year = sort(rep(seq(2015, 2022, by = 1), 18)),
  code = rep(sample_code, 8),
  count = round(runif(18 * 8, 1, 100), digits = 0)
)

# identify the top 10 codes by total count across all years
top_10_overall <- code_data %>%
  group_by(code) %>%
  summarize(total_count = sum(count), .groups = "drop") %>%
  arrange(desc(total_count)) %>%
  head(10)
top_ten_code <- factor(top_10_overall$code)

# filter data with top 10 codes and convert to factor
to_graph_data <- code_data %>%
  filter(code %in% as.character(top_ten_code)) %>%
  mutate(code = factor(code, levels = levels(top_ten_code)))

# plot the filtered data
ggplot(data = to_graph_data, aes(fill = code)) +
  geom_bar(aes(x= Year, y = count), stat = "identity", position=position_dodge()) +
  labs(fill = "Tencode") + theme(
    axis.title.x = element_text(color="black", size=15, face="bold"),
    axis.title.y = element_text(color="black", size=15, face="bold"),
    legend.title = element_text(colour = "black", size = 10, face = "bold"),
    legend.text = element_text(colour = "black", size = 10, face = "plain"),
    axis.text.x = element_text(colour = "black", size = 10, face = "plain"),
    axis.text.y = element_text(colour = "black", size = 10, face = "plain")) +
  labs( x="Year", y = "Count") +
  scale_y_continuous(expand = c(0, 0))+
  guides(fill = guide_legend(reverse = TRUE))

Created on 2022-07-25 by the reprex package (v2.0.1)
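
The same idea can be applied to data shaped like the question's, with one row per incident rather than a precomputed count. The pipeline in the question most likely fails because `count(Tencode.Description)` returns only the description and `n`, so `Year` is no longer there when `ggplot()` is called. The following is a minimal sketch under stated assumptions: `complete_df` here is a made-up stand-in, not the asker's data, and `top_n_groups` is a hypothetical name for N. The totals table is used only to pick the groups and fix the factor order; the original rows are what get plotted.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for complete_df: one row per incident
set.seed(1)
complete_df <- tibble(
  Year = sample(2015:2022, 2000, replace = TRUE),
  Tencode.Description = sample(
    c("Theft", "Fire", "Drowning", "Vehicle Accident", "Missing Person"),
    2000, replace = TRUE, prob = c(0.4, 0.25, 0.05, 0.2, 0.1)
  )
)

top_n_groups <- 3  # N, chosen arbitrarily for this sketch

# Totals per group across all years, largest first.
# This table has no Year column, so it only selects the groups.
top_codes <- complete_df %>%
  count(Tencode.Description, sort = TRUE, name = "total") %>%
  head(top_n_groups)

# Keep the original rows (Year intact) for the top groups only,
# with factor levels ordered from largest to smallest total
plot_df <- complete_df %>%
  filter(Tencode.Description %in% top_codes$Tencode.Description) %>%
  mutate(Tencode.Description = factor(Tencode.Description,
                                      levels = top_codes$Tencode.Description))

# geom_bar() counts the rows per Year and group itself
p <- ggplot(plot_df, aes(x = factor(Year), fill = Tencode.Description)) +
  geom_bar(position = position_dodge()) +
  labs(x = "Year", y = "Count", fill = "Tencode")
p
```

Because the factor levels follow the overall totals, the bars within each year and the legend entries appear in the same order, from the largest group overall to the smallest. This does not re-sort the bars separately for each year; that would need a different approach.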
