Stacked Barplot with percentages of total, divided into groups
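I am trying to plot a stacked bar chart from the df below. My goal is to show how each "Marker" is distributed differently across the two "Groups" (IPN, INV), so that the 4 subgroups (WT, MUT-i, MUT-p, MUT-d) sum to 100%.
What is the best way to do this?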
df <- structure(list(Marker = c("p16", "p16", "p16", "p16", "p16",
"p16", "p16", "p16", "p53", "p53", "p53", "p53", "p53", "p53",
"p53", "p53", "c-MET", "c-MET", "c-MET", "c-MET", "c-MET", "c-MET",
"c-MET", "c-MET", "c-MYC", "c-MYC", "c-MYC", "c-MYC", "c-MYC",
"c-MYC", "c-MYC", "c-MYC", "EGFR", "EGFR", "EGFR", "EGFR", "EGFR",
"EGFR", "EGFR", "EGFR", "HER2-CISH", "HER2-CISH", "HER2-CISH",
"HER2-CISH", "HER2-CISH", "HER2-CISH", "HER2-CISH", "HER2-CISH",
"PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%",
"PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 IC1%", "PD-L1 TPS1%", "PD-L1 TPS1%",
"PD-L1 TPS1%", "PD-L1 TPS1%", "PD-L1 TPS1%", "PD-L1 TPS1%", "PD-L1 TPS1%",
"PD-L1 TPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%",
"PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%", "PD-L1 CPS1%"),
Group = c("IPN", "IPN", "IPN", "IPN", "INV", "INV", "INV",
"INV", "IPN", "IPN", "IPN", "IPN", "INV", "INV", "INV", "INV",
"IPN", "IPN", "IPN", "IPN", "INV", "INV", "INV", "INV", "IPN",
"IPN", "IPN", "IPN", "INV", "INV", "INV", "INV", "IPN", "IPN",
"IPN", "IPN", "INV", "INV", "INV", "INV", "IPN", "IPN", "IPN",
"IPN", "INV", "INV", "INV", "INV", "IPN", "IPN", "IPN", "IPN",
"INV", "INV", "INV", "INV", "IPN", "IPN", "IPN", "IPN", "INV",
"INV", "INV", "INV", "IPN", "IPN", "IPN", "IPN", "INV", "INV",
"INV", "INV"), Subgroup = c("WT", "MUT-i", "MUT-p", "MUT-d",
"WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p",
"MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i",
"MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT",
"MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d",
"WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p",
"MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i",
"MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d", "WT",
"MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d",
"WT", "MUT-i", "MUT-p", "MUT-d", "WT", "MUT-i", "MUT-p",
"MUT-d", "WT", "MUT-i", "MUT-p", "MUT-d"), `Number of Cases` = c(59,
0, 1, 5, 42, 0, 0, 1, 42, 2, 3, 18, 27, 1, 2, 12, 7, 15,
11, 23, 14, 9, 10, 12, 56, 0, 1, 8, 41, 1, 0, 3, 17, 16,
11, 20, 18, 12, 10, 6, 60, 0, 0, 4, 44, 0, 0, 2, 60, 1, 1,
4, 42, 0, 0, 0, 63, 0, 0, 2, 39, 1, 0, 2, 48, 4, 4, 9, 31,
3, 1, 7)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-72L))
The objective is something like this, or similar: whichever you think represents the data well.
Thank you very much!
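Update: on request, removing the 0% labels (the first solution has been removed).
One way to remove the 0% labels is to subset the data passed to geom_text for labelling: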
library(tidyverse)

# Sum the cases per Group/Marker/Subgroup, then convert to percentages.
df1 <- df %>%
  mutate(Marker = fct_relevel(Marker, c("p16", "p53", "c-MET", "c-MYC")),
         Subgroup = fct_relevel(Subgroup, c("WT", "MUT-i", "MUT-p", "MUT-d"))) %>%
  group_by(Group, Marker, Subgroup) %>%
  # drop_last keeps the Group/Marker grouping, so pct sums to 100 within each bar
  summarise(sum = sum(`Number of Cases`), .groups = "drop_last") %>%
  # group_by(Subgroup) %>% # depends on what percent you want
  mutate(pct = prop.table(sum) * 100)

ggplot(df1, aes(Marker, pct, fill = Subgroup)) +
  geom_col(position = position_stack()) +
  ylab("Percentage") +
  # label only the non-zero segments
  geom_text(data = df1 %>% filter(pct != 0),
            aes(label = paste0(sprintf("%1.1f", pct), "%")),
            position = position_stack(vjust = 0.5)) +
  facet_wrap(. ~ Group) +
  ggtitle("My title") +
  theme_bw()
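To check that the four subgroups really add up to 100% within each bar, the percentage step can be reproduced in base R on a small slice of the data. This is only a sketch: `toy` holds the p16 and p53 rows copied from the question, and the column name `cases` is a shortened stand-in for `Number of Cases`, not a name from the original data.

```r
# Toy subset of the question's data: two markers, both groups.
# `cases` is an assumed short name for the `Number of Cases` column.
toy <- data.frame(
  Marker   = rep(c("p16", "p53"), each = 8),
  Group    = rep(rep(c("IPN", "INV"), each = 4), times = 2),
  Subgroup = rep(c("WT", "MUT-i", "MUT-p", "MUT-d"), times = 4),
  cases    = c(59, 0, 1, 5, 42, 0, 0, 1, 42, 2, 3, 18, 27, 1, 2, 12)
)

# Percentage of each subgroup within its Group x Marker bar.
toy$pct <- with(toy, ave(cases, Group, Marker,
                         FUN = function(x) 100 * x / sum(x)))

# Total percentage per bar; every row should come out as 100.
totals <- aggregate(pct ~ Group + Marker, data = toy, FUN = sum)
print(totals)

# Rows that would receive a text label once zero segments are dropped.
labels <- subset(toy, pct != 0)
print(labels[, c("Marker", "Group", "Subgroup", "pct")])
```

If a different denominator is wanted (for example percentages of all cases in a Group rather than per Marker), changing the grouping variables passed to `ave()` is the base R counterpart of changing the `group_by()` call before `mutate()`.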