Overlay lines on stacked bar chart using ggplot2 in R

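I am trying to make a figure like the following one from Tuominen-Soini et al. (2012) in R, using ggplot2.

[Figure: plot from Tuominen-Soini et al. (2012)]

I have a data.frame, bars_df, with four variables (the data is at the end of the question):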

> str(bars_df)
'data.frame':   18 obs. of  4 variables:
 $ key : chr  "time_2" "time_2" "time_2" "time_2" ...
 $ val : Factor w/ 6 levels "0","1","2","3",..: 1 2 3 4 5 6 1 2 3 4 ...
 $ sum : num  0 147 144 63 512 30 0 100 302 168 ...
 $ prop: num  0 0.164 0.161 0.07 0.571 0.033 0 0.098 0.297 0.165 ...

Using bars_df, I made a bar chart with the following:

library(ggplot2)
ggplot(bars_df, aes(x = key, y = prop, fill = val)) +
    geom_col(position = 'stack')

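[Figure: test bar chart]

In addition, I created frequency tables of individuals shifting from one code to another (or to the same code) between time_1 and time_2 and between time_2 and time_3, and evaluated whether each shift (shift_1 for time_1 to time_2; shift_2 for time_2 to time_3) was more likely than expected by chance (denoted with +) or less likely than chance (denoted with -). From this I made the following data.frame (the data is again at the end):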

> str(lines_df)
'data.frame':   72 obs. of  3 variables:
 $ code : chr  "0-0" "0-1" "0-2" "0-3" ...
 $ shift: chr  "shift_1" "shift_1" "shift_1" "shift_1" ...
 $ sig  : chr  "+" NA NA NA ...

For example, in the first row, "0-0" represents a shift between time_1 and time_2 from code 0 to code 0 (in effect, not a shift). So, individuals with a 0 code at time_1 are likely to remain with a 0 at time_2. I'd like to add lines with two different linetypes, one each for the + and - shifts, as in the figure above.
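Since each code packs a starting and an ending value into one string, it may help to split it into two columns before matching it against the bar segments. The following is only a sketch in base R; `split_code` is a hypothetical helper name, not something from the question or the answer.

```r
# Hypothetical helper: split "from-to" codes such as "0-0" or "1-2"
# into the code at the earlier time point and the code at the later one.
split_code <- function(code) {
  parts <- strsplit(code, "-", fixed = TRUE)
  data.frame(
    from = vapply(parts, `[`, character(1), 1),
    to   = vapply(parts, `[`, character(1), 2),
    stringsAsFactors = FALSE
  )
}

split_code(c("0-0", "1-2", "3-4"))
```

The resulting `from` and `to` columns could then be bound to lines_df with `cbind()` and compared with the `val` column of bars_df.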

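While there are some examples of questions showing how to overlay a line on a bar chart, I can't see how to combine these two data.frames in this case. It seems difficult given this specific configuration of bars and lines.

bars_df data: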

bars_df <- structure(list(key = c("time_2", "time_2", "time_2", "time_2", 
"time_2", "time_2", "time_1", "time_1", "time_1", "time_1", "time_1", 
"time_1", "time_3", "time_3", "time_3", "time_3", "time_3", "time_3"
), val = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("0", "1", "2", "3", 
"4", "5"), class = "factor"), sum = c(0, 147, 144, 63, 512, 30, 
0, 100, 302, 168, 412, 35, 0, 51, 56, 84, 252, 20), prop = c(0, 
0.164, 0.161, 0.07, 0.571, 0.033, 0, 0.098, 0.297, 0.165, 0.405, 
0.034, 0, 0.11, 0.121, 0.181, 0.544, 0.043)), .Names = c("key", 
"val", "sum", "prop"), row.names = c(NA, -18L), class = "data.frame")

lines_df data:

lines_df <- structure(list(code = c("0-0", "0-1", "0-2", "0-3", "0-4", "0-5", 
"1-0", "1-1", "1-2", "1-3", "1-4", "1-5", "2-0", "2-1", "2-2", 
"2-3", "2-4", "2-5", "3-0", "3-1", "3-2", "3-3", "3-4", "3-5", 
"4-0", "4-1", "4-2", "4-3", "4-4", "4-5", "5-0", "5-1", "5-2", 
"5-3", "5-4", "5-5", "0-0", "0-1", "0-2", "0-3", "0-4", "0-5", 
"1-0", "1-1", "1-2", "1-3", "1-4", "1-5", "2-0", "2-1", "2-2", 
"2-3", "2-4", "2-5", "3-0", "3-1", "3-2", "3-3", "3-4", "3-5", 
"4-0", "4-1", "4-2", "4-3", "4-4", "4-5", "5-0", "5-1", "5-2", 
"5-3", "5-4", "5-5"), shift = c("shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2"), sig = c("+", NA, NA, NA, NA, 
NA, NA, NA, "-", "-", NA, NA, NA, NA, "+", NA, NA, NA, NA, NA, 
NA, "+", "-", NA, NA, "-", NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, "+", NA, "+", "+", NA, NA, "-", 
"-", NA, NA, NA, NA, NA, NA, "+", NA, NA, NA, NA, NA, NA, "+", 
NA, NA, NA, NA, NA, NA, NA)), .Names = c("code", "shift", "sig"
), row.names = c(NA, -72L), class = "data.frame")

I like to write ugly code when nobody's watching.

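library(dplyr)
library(ggplot2)

# Start, end and midpoint of every stacked segment within each time point,
# plus the label of the following time point (NA for the last one).
d <- arrange(bars_df, key, val) %>%
  group_by(key) %>%
  mutate(prop_start = lag(cumsum(prop), default = 0),
         prop_end = prop_start + prop,
         midpoint = (prop_start + prop_end) / 2) %>%
  ungroup() %>%
  mutate(next_key = paste("time", as.integer(gsub("\\D", "", key)) + 1,
                          sep = "_"),
         next_key = ifelse(next_key %in% unique(bars_df$key), next_key, NA))

# Segment midpoints re-keyed to the preceding time point: these are the
# positions where the transition lines end.
e <- select(d, key, next_val = val, midpoint_end = midpoint) %>%
  mutate(key = paste("time", as.integer(gsub("\\D", "", key)) - 1,
                     sep = "_")) %>%
  filter(key %in% unique(d$key))

# Pair every starting code with every ending code, build the "from-to" code
# and the shift label, then keep only the transitions flagged in lines_df.
e <- inner_join(d, e, by = "key") %>%
  mutate(code = paste(val, next_val, sep = "-"),
         shift = paste("shift", gsub("\\D", "", key), sep = "_")) %>%
  left_join(lines_df, by = c("code", "shift")) %>%
  filter(!is.na(sig))

# Bars are drawn as thick vertical segments so their positions are explicit;
# the transitions are arrows whose line type encodes the sign.
ggplot(d,
       aes(x = key, xend = key, y = prop_start, yend = prop_end)) +
  geom_segment(aes(color = val), size = 10) +
  geom_segment(data = e,
               aes(x = key, xend = next_key,
                   y = midpoint, yend = midpoint_end,
                   lty = sig),
               arrow = arrow(length = unit(6, "pt")))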
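The core of the approach above is to compute where each stacked segment starts and ends, then attach the lines at the segment midpoints. As a minimal base-R sketch of just that step (`stack_positions` is a hypothetical helper name, and the input is the time_1 `prop` column from bars_df):

```r
# Hypothetical helper: cumulative start, end and midpoint of each segment
# in a single stacked bar, given the segment heights in stacking order.
stack_positions <- function(prop) {
  end <- cumsum(prop)
  start <- c(0, head(end, -1))  # each segment starts where the previous ends
  data.frame(start = start, end = end, midpoint = (start + end) / 2)
}

# Proportions for time_1, codes 0 to 5, taken from bars_df.
time_1_prop <- c(0, 0.098, 0.297, 0.165, 0.405, 0.034)
pos <- stack_positions(time_1_prop)
pos
```

A line from code i at one time point to code j at the next would then run from the i-th midpoint of the first bar to the j-th midpoint of the second bar.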
