R & ggplot2 - how to plot relative frequency of a categorical split by a binary variable

I can easily make a relative frequency plot with one 'base' category along the x-axis and the relative frequency of another categorical variable on the y-axis:

library(ggplot2)
ggplot(diamonds) +
  aes(x = cut, fill = color) +
  geom_bar(position = "fill")

Now say I have that categorical variable split in some way by a binary variable:

diamonds <- data.frame(diamonds)
# Random 0/1 indicator, one value per row
diamonds$binary_dummy <- sample(c(0, 1), nrow(diamonds), replace = TRUE)

How do I plot the original categorical, but now showing the binary split within the colour ('color') variable? Preferably the split would be represented by two different shades of the original colour.

Basically I am trying to reproduce this plot: Freq_plot_example

As you can see from the legend, each category is split by "NonSyn"/"Syn", and each split is coloured as a dark/light shade of another distinct colour (e.g. "regulatory proteins NonSyn" = dark pink, "regulatory proteins Syn" = light pink).

If you don't mind manually setting the palette, you could do something like this:

library(ggplot2)
library(colorspace)

df <- data.frame(diamonds)
df$binary_dummy <- sample(c(0, 1), nrow(df), replace = TRUE)

# One base colour per level of `color`, then interleave each with a darker shade
pal <- scales::brewer_pal(palette = "Set1")(nlevels(df$color))
pal <- c(rbind(pal, darken(pal, amount = 0.2)))

ggplot(df, aes(x = cut, fill = interaction(binary_dummy, color))) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = pal)

Created on 2020-04-14 by the reprex package (v0.3.0)
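
To see why the interleaved palette lines up with the fill groups, here is a small base-R sketch. The toy factors and colour names are made up for illustration and stand in for `df$binary_dummy`, `df$color` and the Set1 palette. It relies on two things: `interaction()` varies its first argument fastest, and `c(rbind(a, b))` interleaves two equal-length vectors.

```r
# Toy stand-ins for df$binary_dummy and df$color (hypothetical data)
binary <- factor(c(0, 1, 0, 1))
colour <- factor(c("D", "D", "E", "E"))

# interaction() varies its first argument fastest, so the two binary
# groups of each colour level sit next to each other
lv <- levels(interaction(binary, colour))
print(lv)

# c(rbind(a, b)) reads the 2-row matrix column by column,
# which interleaves base and darkened colours
base_cols <- c("red", "blue")          # placeholder for the Set1 colours
dark_cols <- c("darkred", "darkblue")  # placeholder for darken(pal)
pal_demo <- c(rbind(base_cols, dark_cols))

# Pair each level with the colour it would receive
print(setNames(pal_demo, lv))
```

With this ordering, the `0` group of each colour level gets the base colour and the `1` group gets the darker shade.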

EDIT: To keep each interaction level tied to the same colour, even when a level is missing from the data, you can set a named palette, e.g.:

pal <- setNames(pal, levels(interaction(df$binary_dummy, df$color)))

# Drop one level combination to check that the other colours stay put
df <- df[!(df$binary_dummy == 0 & df$color == "E"),]

ggplot(df, aes(x = cut, fill = interaction(binary_dummy, color))) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = pal)
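
As a rough illustration of why the names matter, the base-R sketch below mimics the lookup with a toy palette. The colour names and levels are hypothetical, and this is not ggplot2's internal code. It only contrasts assigning colours by position with assigning them by name once one level has dropped out.

```r
# Toy named palette, one entry per interaction level (hypothetical colours)
pal_named <- c("0.D" = "red", "1.D" = "darkred",
               "0.E" = "blue", "1.E" = "darkblue")

# Levels still present after "0.E" has been removed from the data
present <- c("0.D", "1.D", "1.E")

# By position: "1.E" slides into the slot that belonged to "0.E"
by_position <- unname(pal_named)[seq_along(present)]

# By name: every remaining level keeps its own colour
by_name <- unname(pal_named[present])

print(data.frame(level = present, by_position, by_name))
```

In the positional column `1.E` takes the colour meant for `0.E`, while the name-based lookup leaves it on its own darker shade.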

Alternatively, you can also set the breaks of the scale:

ggplot(df, aes(x = cut, fill = interaction(binary_dummy, color))) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = pal, breaks = levels(interaction(df$binary_dummy, df$color)))
