ggplot2 error when using both the fill and group arguments in geom_bar

Question

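R's ggplot2 library seems to have a problem when I include both the fill and group arguments in a bar plot (geom_bar()). I have spent several hours looking for an answer but could not find one that helps. This is actually my first post here.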

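To give a little background, I have a data frame named smokement (short for smoking and mental health), a categorical variable named smoke100 (smoked in the past 100 days?) with the values "Yes" and "No", and another categorical variable named misnervs (frequency of feeling nervous) with 5 possible values: "All", "Most", "Some", "A little" and "None".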

When I run this code, I get this result:

ggplot(data = smokement) + 
geom_bar(aes(x = smoke100, fill = smoke100)) + 
facet_wrap(~misnervs, nrow = 1)

First code output

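However, the result I want is for each group of bars to show its own proportions. From reading parts of the book "R for Data Science", I found that I need to include y = ..prop.. and group = 1 in aes() to achieve that: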

ggplot(data = smokement) + 
geom_bar(aes(x = smoke100, y = ..prop.., group = 1)) + 
facet_wrap(~misnervs, nrow = 1)

Second code output

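Finally, I tried using fill = smoke100 in aes() to show this categorical variable in color, as I did in the first code. But when I add the fill argument it does not work: the code runs, but it shows exactly the same output as the second code, as if the fill argument were ignored this time!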

ggplot(data = smokement) +
geom_bar(aes(x = smoke100, y = ..prop.., group = 1, fill = smoke100)) +
facet_wrap(~misnervs, nrow = 1)

Third code output

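Does anyone know why this happens and how to fix it? My end goal is to show each value of smoke100 (the "Yes" and "No" bars) with colors and a legend on the right, as in the first plot, while having each misnervs level show its own proportions of the smoke100 levels ("Yes", "No"), as in the second plot.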

Edit:

> dim(smokement)
[1] 35471     6
> str(smokement)
'data.frame':   35471 obs. of  6 variables:
 $ smoke100: Factor w/ 2 levels "Yes","No": 1 2 1 2 1 1 1 1 1 1 ...
 $ misnervs: Factor w/ 5 levels "All","Most","Some",..: 3 4 5 4 1 5 3 3 5 5 ...
 $ mishopls: Factor w/ 5 levels "All","Most","Some",..: 3 5 5 5 5 5 5 5 5 5 ...
 $ misrstls: Factor w/ 5 levels "All","Most","Some",..: 3 5 5 3 1 5 3 5 1 5 ...
 $ misdeprd: Factor w/ 5 levels "All","Most","Some",..: 5 5 5 5 4 5 5 5 5 5 ...
 $ miswtles: Factor w/ 5 levels "All","Most","Some",..: 5 5 5 5 5 5 5 5 5 5 ...
> head(smokement)
  smoke100 misnervs mishopls misrstls misdeprd miswtles
1      Yes     Some     Some     Some     None     None
2       No A little     None     None     None     None
3      Yes     None     None     None     None     None
4       No A little     None     Some     None     None
5      Yes      All     None      All A little     None
6      Yes     None     None     None     None     None

As for the output without group = 1:

ggplot(data = smokement) +
geom_bar(aes(x = smoke100, y = ..prop.., fill = smoke100)) +
facet_wrap(~misnervs, nrow = 1)

Output of the code without group

Answer 1

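I could not get what you want by tweaking the call to geom_bar*, but I think this does what you need. Since you did not provide your input data set (for understandable reasons), I used the diamonds tibble in my code. The changes you need to make should be obvious.

*: I am sure it can be done; I just could not work it out.

The idea behind my solution is to precompute the proportions you want before calling ggplot.

group_modify takes a grouped tibble, applies the specified function to each group in turn, and returns the modified (grouped) tibble.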

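library(dplyr)
library(ggplot2)

diamonds %>% 
  group_by(cut) %>% 
  # within each cut, compute the share of rows that falls in each color
  group_modify(
    function(.x, .y) 
      .x %>% 
        group_by(color) %>% 
        # `.` is the piped-in data for this cut, so nrow(.) is the cut's row count
        summarise(Prop = n() / nrow(.))
  ) %>% 
  ggplot() +
    geom_col(aes(x = color, y = Prop, fill = color)) +
    facet_wrap(~cut)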

Note the switch from geom_bar to geom_col: geom_bar uses the number of rows, whereas geom_col uses the values in the data.
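The same precomputation can be adapted to the asker's variables. What follows is only a minimal sketch: it assumes a small simulated smokement data frame (the real data were not posted), and it uses count() plus mutate() rather than group_modify(). The names level_names and props are illustrative, not from the question.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in for the asker's data (an assumption, not the real data)
level_names <- c("All", "Most", "Some", "A little", "None")
set.seed(123)
smokement <- data.frame(
  smoke100 = sample(c("Yes", "No"), 100, replace = TRUE),
  misnervs = factor(sample(level_names, 100, replace = TRUE), levels = level_names)
)

# Count each smoke100 value within each misnervs level, then divide by the
# total number of rows for that misnervs level
props <- smokement %>%
  count(misnervs, smoke100) %>%
  group_by(misnervs) %>%
  mutate(Prop = n / sum(n)) %>%
  ungroup()

# geom_col plots the precomputed Prop values, so fill can be mapped freely
p <- ggplot(props) +
  geom_col(aes(x = smoke100, y = Prop, fill = smoke100)) +
  facet_wrap(~misnervs, nrow = 1)
p
```

Because the proportions are ordinary columns at this point, they can be inspected or checked (for example, that they add up to 1 within each misnervs level) before anything is drawn.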

As a rough QC, here is the equivalent of the code that produces the "all gray" plot:

diamonds %>% 
  ggplot() +
    geom_bar(aes(x=color, y=..prop.., fill=color, group=1)) +
    facet_wrap(~cut)
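For what it is worth, the behavior in the question can be inspected by looking at the values the stat computes. As far as I understand it, prop is a proportion within each group. Without an explicit group, ggplot2 groups by the discrete variables in the plot, so every bar is its own group and gets a proportion of 1. With group = 1 the whole panel is one group, so fill is no longer constant within the group and is dropped. The sketch below uses layer_data() on the diamonds data to show this. It assumes ggplot2 3.3.0 or later, where after_stat(prop) is the current spelling of ..prop..; the object names are illustrative.

```r
library(ggplot2)

# No explicit group: each bar forms its own group, so its proportion is 1
p_default <- ggplot(diamonds) +
  geom_bar(aes(x = color, y = after_stat(prop), fill = color)) +
  facet_wrap(~cut)

# group = 1: one group per panel, so proportions are relative to the panel,
# but fill varies inside that single group and is dropped by the stat
p_grouped <- ggplot(diamonds) +
  geom_bar(aes(x = color, y = after_stat(prop), fill = color, group = 1)) +
  facet_wrap(~cut)

# layer_data() returns the data frame computed for the first layer
default_data <- layer_data(p_default)
grouped_data <- layer_data(p_grouped)

# Distinct proportions in the ungrouped plot
unique(default_data$prop)

# Sum of the proportions within each facet panel of the grouped plot
tapply(grouped_data$prop, grouped_data$PANEL, sum)
```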

Answer 2

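In addition to the solution provided here, the GGally package includes a stat_prop, which introduces a new by aesthetic for specifying how the proportions are computed: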

library(GGally)

ggplot(data = smokement) + 
  geom_bar(aes(x = smoke100, y = ..prop.., fill = smoke100, by = misnervs), stat = "prop") + 
  facet_wrap(~misnervs, nrow = 1)

FYI, this can also be done without GGally by setting fill = factor(..x..):

ggplot(data = smokement) + 
  geom_bar(aes(x = smoke100, y = ..prop.., fill = factor(..x..), group = 1)) + 
  facet_wrap(~misnervs, nrow = 1)
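The ..prop.. and ..x.. notation was deprecated in ggplot2 3.4.0 in favor of after_stat(), which has been available since 3.3.0. A minimal sketch of the same idea in the newer notation follows. It assumes simulated data of the same shape as the sample data in this answer, with smoke100 converted to a factor so that its levels can be reused as legend labels. The scale_fill_discrete() call is my own addition to replace the default "1" and "2" legend entries, and level_names is an illustrative name.

```r
library(ggplot2)

# Simulated data (an assumption; same shape as the sample data in this answer)
level_names <- c("All", "Most", "Some", "A little", "None")
set.seed(123)
smokement <- data.frame(
  smoke100 = factor(sample(c("Yes", "No"), 100, replace = TRUE),
                    levels = c("Yes", "No")),
  misnervs = factor(sample(level_names, 100, replace = TRUE), levels = level_names)
)

p <- ggplot(data = smokement) +
  geom_bar(aes(x = smoke100,
               y = after_stat(prop),          # proportion within the panel
               fill = factor(after_stat(x)),  # color by computed x position
               group = 1)) +
  # The computed x positions are 1 and 2, in the order of the factor levels
  scale_fill_discrete(name = "smoke100", labels = levels(smokement$smoke100)) +
  facet_wrap(~misnervs, nrow = 1)
p
```

Mapping fill to the computed x works where fill = smoke100 does not because x survives the statistical transformation, whereas the original fill mapping is dropped when all bars in a panel share one group.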

Data

misnervs <- c("All", "Most", "Some", "A little", "None")

set.seed(123)

smokement <- 
  data.frame(
    smoke100 = sample(c("Yes", "No"), 100, replace = TRUE),
    misnervs = factor(sample(misnervs, 100, replace = TRUE), levels = misnervs)
  )
