
Overlay lines on stacked bar chart using ggplot2 in R

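I am trying to make a figure like the following one from Tuominen-Soini et al. (2012) in R using ggplot2.

[Figure: plot from Tuominen-Soini et al. (2012)]

I have a data.frame, bars_df, with four variables (the data are at the end of the question):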

> str(bars_df)
'data.frame':   18 obs. of  4 variables:
 $ key : chr  "time_2" "time_2" "time_2" "time_2" ...
 $ val : Factor w/ 6 levels "0","1","2","3",..: 1 2 3 4 5 6 1 2 3 4 ...
 $ sum : num  0 147 144 63 512 30 0 100 302 168 ...
 $ prop: num  0 0.164 0.161 0.07 0.571 0.033 0 0.098 0.297 0.165 ...

Using bars_df, I made a bar chart with the following:

library(ggplot2)
ggplot(bars_df, aes(x = key, y = prop, fill = val)) +
    geom_col(position = 'stack')

[Figure: the resulting stacked bar chart]

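In addition, I created frequency tables of individuals shifting from one code to another (or to the same code) between time_1 and time_2 and between time_2 and time_3. I then evaluated whether each shift (shift_1 between time_1 and time_2; shift_2 between time_2 and time_3) was more likely than expected by chance (denoted +) or less likely than expected by chance (denoted -). From this I made the following data.frame (the data are again at the end):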

> str(lines_df)
'data.frame':   72 obs. of  3 variables:
 $ code : chr  "0-0" "0-1" "0-2" "0-3" ...
 $ shift: chr  "shift_1" "shift_1" "shift_1" "shift_1" ...
 $ sig  : chr  "+" NA NA NA ...

For example, in the first row, "0-0" denotes a shift between time_1 and time_2 from code 0 to code 0 (in effect, not a shift). So, individuals with a 0 code at time_1 are likely to remain with a 0 at time_2. I'd like to add lines with two different linetypes, one each for the + and - shifts, as in the figure above.
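One way to make the `code` strings easier to work with is to split them into their two halves and translate each `shift` into the pair of time points it connects. The sketch below does this in base R on a few rows copied from `lines_df`. The column names `from`, `to`, `key`, and `next_key` are illustrative assumptions, as is the reading of `shift_2` as the move from `time_2` to `time_3`.

```r
# A few rows copied from lines_df (the full data are at the end of the question).
lines_sample <- data.frame(
  code  = c("0-0", "0-1", "1-2", "1-1"),
  shift = c("shift_1", "shift_1", "shift_1", "shift_2"),
  sig   = c("+", NA, "-", "+"),
  stringsAsFactors = FALSE
)

# Keep only shifts flagged as more (+) or less (-) likely than chance.
sig_lines <- lines_sample[!is.na(lines_sample$sig), ]

# Split "a-b" into the starting code (a) and the ending code (b).
sig_lines$from <- sub("-.*$", "", sig_lines$code)
sig_lines$to   <- sub("^.*-", "", sig_lines$code)

# Assumption: shift_n runs from time_n to time_(n + 1).
shift_n <- as.integer(sub("\\D+", "", sig_lines$shift))
sig_lines$key      <- paste("time", shift_n, sep = "_")
sig_lines$next_key <- paste("time", shift_n + 1L, sep = "_")

sig_lines
```

With the codes split this way, each remaining row names the bar segment a line would start from (`key`, `from`) and the one it would end at (`next_key`, `to`).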

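While there are example questions showing how to overlay a line on a bar chart, I don't see how to combine these two data.frames in this case. It seems difficult given this specific configuration of bars and lines.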

bars_df data:

bars_df <- structure(list(key = c("time_2", "time_2", "time_2", "time_2", 
"time_2", "time_2", "time_1", "time_1", "time_1", "time_1", "time_1", 
"time_1", "time_3", "time_3", "time_3", "time_3", "time_3", "time_3"
), val = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("0", "1", "2", "3", 
"4", "5"), class = "factor"), sum = c(0, 147, 144, 63, 512, 30, 
0, 100, 302, 168, 412, 35, 0, 51, 56, 84, 252, 20), prop = c(0, 
0.164, 0.161, 0.07, 0.571, 0.033, 0, 0.098, 0.297, 0.165, 0.405, 
0.034, 0, 0.11, 0.121, 0.181, 0.544, 0.043)), .Names = c("key", 
"val", "sum", "prop"), row.names = c(NA, -18L), class = "data.frame")

lines_df data:

lines_df <- structure(list(code = c("0-0", "0-1", "0-2", "0-3", "0-4", "0-5", 
"1-0", "1-1", "1-2", "1-3", "1-4", "1-5", "2-0", "2-1", "2-2", 
"2-3", "2-4", "2-5", "3-0", "3-1", "3-2", "3-3", "3-4", "3-5", 
"4-0", "4-1", "4-2", "4-3", "4-4", "4-5", "5-0", "5-1", "5-2", 
"5-3", "5-4", "5-5", "0-0", "0-1", "0-2", "0-3", "0-4", "0-5", 
"1-0", "1-1", "1-2", "1-3", "1-4", "1-5", "2-0", "2-1", "2-2", 
"2-3", "2-4", "2-5", "3-0", "3-1", "3-2", "3-3", "3-4", "3-5", 
"4-0", "4-1", "4-2", "4-3", "4-4", "4-5", "5-0", "5-1", "5-2", 
"5-3", "5-4", "5-5"), shift = c("shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_1", "shift_1", "shift_1", 
"shift_1", "shift_1", "shift_1", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2", "shift_2", "shift_2", "shift_2", 
"shift_2", "shift_2", "shift_2"), sig = c("+", NA, NA, NA, NA, 
NA, NA, NA, "-", "-", NA, NA, NA, NA, "+", NA, NA, NA, NA, NA, 
NA, "+", "-", NA, NA, "-", NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, "+", NA, "+", "+", NA, NA, "-", 
"-", NA, NA, NA, NA, NA, NA, "+", NA, NA, NA, NA, NA, NA, "+", 
NA, NA, NA, NA, NA, NA, NA)), .Names = c("code", "shift", "sig"
), row.names = c(NA, -72L), class = "data.frame")

I like to write ugly code when no one's looking.

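library(dplyr)
library(ggplot2)

# Time points present in the data ("time_1", "time_2", "time_3").
time_levels <- sort(unique(bars_df$key))

# For each time point, stack the segments: compute where each one starts
# and ends, its midpoint, and the name of the following time point.
d <- arrange(bars_df, key, val) %>%
  group_by(key) %>%
  mutate(prop_start = lag(cumsum(prop), default = 0),
         prop_end = prop_start + prop,
         midpoint = (prop_start + prop_end) / 2,
         next_key = paste("time", 1 + as.integer(gsub("\\D", "", key)),
                          sep = "_")) %>%
  ungroup() %>%
  mutate(next_key = ifelse(next_key %in% time_levels, next_key, NA_character_))

# Midpoints at the following time point, re-keyed to the preceding one so
# that they can be joined on as the end points of the lines.
e <- select(d, key, next_val = val, midpoint_end = midpoint) %>%
  mutate(key = paste("time", as.integer(gsub("\\D", "", key)) - 1,
                     sep = "_")) %>%
  filter(key %in% time_levels)

# Pair every segment with every segment at the next time point, build the
# "from-to" code, and keep only the shifts flagged + or - in lines_df.
# The join on key is deliberately many-to-many (dplyr >= 1.1.0 warns here).
# Assumes shift_n describes the move from time_n to time_(n + 1).
e <- full_join(d, e, by = "key") %>%
  filter(!is.na(midpoint_end)) %>%
  mutate(code = paste(val, next_val, sep = "-"),
         shift = sub("time", "shift", key)) %>%
  left_join(lines_df, by = c("code", "shift")) %>%
  filter(!is.na(sig))

# Draw the bars as thick segments, then overlay the shift lines.
ggplot(d,
       aes(x = key, xend = key, y = prop_start, yend = prop_end)) +
  geom_segment(aes(color = val), size = 10) +  # ggplot2 >= 3.4.0 prefers linewidth = 10
  geom_segment(data = e,
               aes(x = key, xend = next_key,
                   y = midpoint, yend = midpoint_end,
                   linetype = sig),
               arrow = arrow(length = unit(6, "pt")))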

[Figure: the resulting plot]
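The line end points in the code above come from the midpoint of each stacked segment. As a minimal check of that arithmetic, the following base R sketch recomputes the start, end, and midpoint for the `time_1` proportions from `bars_df`. The variable names are illustrative, and it is only a sketch of the calculation, not a replacement for the pipeline.

```r
# prop values for time_1, codes 0 to 5, copied from bars_df.
prop <- c(0, 0.098, 0.297, 0.165, 0.405, 0.034)

prop_end   <- cumsum(prop)                 # upper edge of each segment
prop_start <- c(0, head(prop_end, -1))     # lower edge: previous upper edge
midpoint   <- (prop_start + prop_end) / 2  # where a line should attach

data.frame(code = 0:5, prop_start, prop_end, midpoint)
```

Starting the first segment at 0 avoids the `NA` that `lag()` returns by default for the first row of each group.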
