
Plot a line on a barchart in ggplot2

I have built a stacked bar chart showing the relative proportions of responses to different questions. Now I want to show a particular response on top of that bar chart, to show how an individual's response relates to the overall proportions of responses.

I created a toy example here:

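library(ggplot2)
n = 1000
n_groups = 5
# Simulated survey data: one state and one frequency per row, n / n_groups rows per question
overall_df = data.frame(
  state = sample(letters[1:8], n, replace = TRUE),
  frequency = runif(n, min = 0, max = 1),
  var_id = rep(LETTERS[1:n_groups], each = n / n_groups)
)

# One individual's response (a state) to each question A-E
row = data.frame(
  A = "a", B = "b", C = "c", D = "h", E = "b"
)

# Stacked bars scaled to proportions within each question
ggplot(overall_df,
       aes(fill = state, y = frequency, x = var_id)) +
  geom_bar(position = "fill", stat = "identity")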

The goal here is to have the responses in the object row plotted as a point in the corresponding bar chart box, with a line connecting the points.

Here is a (poorly drawn) example of the desired result. Thanks for your help.

[Image: hand-drawn sketch of the desired result]

This was trickier than I thought. I'm not sure there's any way around manually calculating the x/y coordinates of the line.

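library(dplyr)
library(ggplot2)

# Total frequency per state within each question
df <- overall_df %>% group_by(state, var_id) %>%
  summarize(frequency = sum(frequency), .groups = "drop")

# For each question, find the vertical midpoint of the chosen state's segment:
# everything stacked below it plus half of its own height, as a proportion
freq <- unlist(Map(function(d, val) {
  (sum(d$frequency[d$state > val]) + 0.5 * d$frequency[d$state == val]) /
    sum(d$frequency)
  }, d = split(df, df$var_id), val = row))

# One point per question, using the same column names as the main plot
line_df <- data.frame(state = unlist(row),
                      frequency = freq,
                      var_id = names(row))

ggplot(df, aes(fill = state, y = frequency, x = var_id)) +
  geom_col(position = "fill") +
  geom_line(data = line_df, aes(group = 1)) +
  geom_point(data = line_df, aes(group = 1))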

Created on 2022-03-08 by the reprex package (v2.0.1)
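To make the height calculation above easier to follow, here is a minimal base R sketch on a single hand-checkable bar. The helper name segment_midpoint and the toy numbers are assumptions for illustration only. It also assumes ggplot2's default stacking order, where the first level sits at the top of the bar, so every state that sorts after the chosen one lies below it.

```r
# Hypothetical helper (not from the answer): midpoint, on the 0-1 scale,
# of one segment in a position = "fill" stacked bar.
# Assumes the default stacking order: first level on top, last at the bottom.
segment_midpoint <- function(states, freqs, chosen) {
  below  <- sum(freqs[states > chosen])   # segments stacked underneath
  inside <- sum(freqs[states == chosen])  # height of the chosen segment
  (below + inside / 2) / sum(freqs)
}

# Toy bar: a = 2, b = 3, c = 5 (total 10)
states <- c("a", "b", "c")
freqs  <- c(2, 3, 5)

mid_a <- segment_midpoint(states, freqs, "a")  # (8 + 1)   / 10
mid_b <- segment_midpoint(states, freqs, "b")  # (5 + 1.5) / 10
mid_c <- segment_midpoint(states, freqs, "c")  # (0 + 2.5) / 10
```

Here c occupies 0 to 0.5, b occupies 0.5 to 0.8, and a occupies 0.8 to 1, so the midpoints land at 0.25, 0.65, and 0.9 respectively.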

Here's an automated approach using dplyr. I prepare the summary by joining the label data to the original data, then using group_by + summarize to compute the midpoint height of the selected segment in each bar.

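library(dplyr)
library(ggplot2)

# Selected state per question, in long form
# (example values; these are not the same as `row` in the question)
row_df <- data.frame(state = letters[1:n_groups], var_id = LETTERS[1:n_groups])

# After the join, state.x is the selected state and state.y the observed one
line_df <- row_df %>%
  left_join(overall_df, by = "var_id") %>%
  group_by(var_id) %>%
  summarize(state = last(state.x),
            # everything stacked below the selection plus half of its own height
            frequency = (sum(frequency[state.x < state.y]) +
                         sum(frequency[state.x == state.y]) / 2) / sum(frequency))

ggplot(overall_df, aes(fill = state, y = frequency, x = var_id)) +
  geom_bar(position = "fill", stat = "identity") +
  geom_point(data = line_df) +
  geom_line(data = line_df, aes(group = 1))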

[Image: resulting plot]
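The row_df above is typed out by hand, with values that differ from the question's row object. As a sketch, assuming you want to feed the question's wide, one-row data frame into the same join, it can be reshaped to long form in base R like this (stringsAsFactors = FALSE is set explicitly so the result is character on older R versions too):

```r
# The individual's responses from the question: one column per question
row <- data.frame(A = "a", B = "b", C = "c", D = "h", E = "b",
                  stringsAsFactors = FALSE)

# Long form: one row per question, with the columns the join expects
row_df <- data.frame(
  var_id = names(row),
  state  = unlist(row, use.names = FALSE),
  stringsAsFactors = FALSE
)
```

The resulting row_df has five rows (A to E) and can replace the hand-typed one in the pipeline above without other changes.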
