简体   繁体   English

堆叠条形图占总数的百分比,分为几组

[英]Stacked Barplot with percentages of total, divided into groups

I am trying to plot a stacked barplot using the following df.我正在尝试使用以下 df plot 堆叠条形图。 My goal is to show the differential distribution of the "Marker"s in two different "Group"s (IPN, INV), so that the sum of all 4 subgroups (WT, MUT-i, MUT-p, MUT-d) equals a 100%.我的目标是显示两个不同“组”(IPN、INV)中“标记”的差异分布,以便所有 4 个子组(WT、MUT-i、MUT-p、MUT-d)的总和等于 100%。

What is the best approach to do this?最好的方法是什么?

structure(list(Marker = c("p16", "p16", "p16", "p16", "p16", 
"p16", "p16", "p16", "p53", "p53", "p53", "p53", "p53", "p53", 
"p53", "p53", "c-MET", "c-MET", "c-MET", "c-MET", "c-MET", "c-MET", 
"c-MET", "c-MET", "c-MYC", "c-MYC", "c-MYC", "c-MYC", "c-MYC", 
"c-MYC", "c-MYC", "c-MYC", "EGFR", "EGFR", "EGFR", "EGFR", "EGFR", 
"EGFR", "EGFR", "EGFR", "HER2-CISH", "HER2-CISH", "HER2-CISH", 
"HER2-CISH", "HER2-CISH", "HER2-CISH", "HER2-CISH", "HER2-CISH", 
"PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", 
"PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 TPS1%", "PD-L1 TPS1%", 
"PD-L1 TPS1%", "PD-L1 TPS1%", "PD-L1 TPS1%", "PD-L1 TPS1%", "PD-L1 TPS1%", 
"PD-L1 TPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", 
"PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%"), 
    Group = c("IPN", "IPN", "IPN", "IPN", "INV", "INV", "INV", 
    "INV", "IPN", "IPN", "IPN", "IPN", "INV", "INV", "INV", "INV", 
    "IPN", "IPN", "IPN", "IPN", "INV", "INV", "INV", "INV", "IPN", 
    "IPN", "IPN", "IPN", "INV", "INV", "INV", "INV", "IPN", "IPN", 
    "IPN", "IPN", "INV", "INV", "INV", "INV", "IPN", "IPN", "IPN", 
    "IPN", "INV", "INV", "INV", "INV", "IPN", "IPN", "IPN", "IPN", 
    "INV", "INV", "INV", "INV", "IPN", "IPN", "IPN", "IPN", "INV", 
    "INV", "INV", "INV", "IPN", "IPN", "IPN", "IPN", "INV", "INV", 
    "INV", "INV"), Subgroup = c("WT", "MUT-i", "MUT-p", "MUT-d", 
    "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", 
    "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", 
    "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT", 
    "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", 
    "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", 
    "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", 
    "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT", 
    "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", 
    "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", 
    "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d"), `Number of Cases` = c(59, 
    0, 1, 5, 42, 0, 0, 1, 42, 2, 3, 18, 27, 1, 2, 12, 7, 15, 
    11, 23, 14, 9, 10, 12, 56, 0, 1, 8, 41, 1, 0, 3, 17, 16, 
    11, 20, 18, 12, 10, 6, 60, 0, 0, 4, 44, 0, 0, 2, 60, 1, 1, 
    4, 42, 0, 0, 0, 63, 0, 0, 2, 39, 1, 0, 2, 48, 4, 4, 9, 31, 
    3, 1, 7)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-72L))type here

Objective, or something similar to it... Whichever you think is a good representation of the data客观的,或类似的东西......无论你认为哪个是数据的一个很好的代表

在此处输入图像描述

Thanks a lot!非常感谢!

Update: request removing 0% : removed first solution:更新:请求删除0% :删除第一个解决方案:

One way to remove the 0% is to subset the data for labeling in ggtext .删除 0% 的一种方法是对数据进行子集化以在ggtext中进行标记。

library(tidyverse)

df1 <- df %>% 
  mutate(Marker = fct_relevel(Marker, c("p16", "p53", "c-MET", "c-MYC")),
         Subgroup = fct_relevel(Subgroup, c("WT", "MUT-i", "MUT-p", "MUT-d"))) %>% 
  group_by(Group, Marker, Subgroup) %>% 
  summarise(sum=sum(`Number of Cases`)) %>% 
  # group_by(Subgroup) %>% #depends on what percent you want
  mutate(pct= prop.table(sum) * 100) 
  
ggplot(df1, aes(Marker, pct, fill=Subgroup)) +
  geom_col(position = position_stack()) +
  ylab("Percentage") +
  geom_text(data = df1 %>% filter(pct != 0),
            aes(label=paste0(sprintf("%1.1f", pct),"%")),
            position=position_stack(vjust=0.5)) +
  facet_wrap(. ~  Group)+
  ggtitle("My title") +
  theme_bw()

在此处输入图像描述

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM