R: How to customize a Sankey plot in ggplotly?

Question

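I have sales data broken down by year and model, which I visualize with a Sankey chart. I am now struggling with two issues:

First, I need model B to always sit at the bottom of the chart, regardless of its value over the years.
Second, when I re-render the ggplot through ggplotly, hovering does not show the sales figure or the year.

Code: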

df <- data.frame(
  model = rep(LETTERS[1:10], times = 6),
  Year  = rep(c(2015, 2016, 2017, 2018, 2019, 2020), each = 10),
  sales = c(450, 678, 456, 344, 984, 456, 234, 244, 655, 789,
            234, 567, 234, 567, 232, 900, 1005, 1900, 450, 345,
            567, 235, 456, 345, 144, 333, 555, 777, 111, 444,
            222, 223, 445, 776, 331, 788, 980, 1003, 456, 434,
            345, 2222, 3456, 456, 678, 8911, 4560, 4567, 4566, 5555,
            6666, 7777, 8888, 1233, 1255, 5677, 3411, 2344, 6122, 4533))

# one-time setup: ggsankey is installed from GitHub
# install.packages("remotes")
# remotes::install_github("davidsjoberg/ggsankey")
library(ggsankey)
library(tidyverse)
library(plotly)  # provides ggplotly()

plot <- ggplot(df, aes(x = Year,
               node = model,
               fill = model,
               value = sales)) +
  geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
  scale_fill_viridis_d(option = "A", alpha = .8) +
  theme_sankey_bump(base_size = 16) 

  
  
ggplotly(plot)

Answer 1

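I'm absolutely sure there's a better way, but it took me a while to get this working. I think this is what you're looking for.

I started with the ggplot and ggplotly objects you already have here. The main purpose of this initial plot is to capture the colors. (I could have captured them in several different ways, but your plot had already done that for me.)

Update: I have revised the two elements you asked about.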

library(ggsankey)
library(tidyverse)
library(plotly)

# df from the question is unchanged

# visualize the original
(plot <- ggplot(df, 
              aes(Year, node = model, fill = model, value = sales)) + 
    geom_sankey_bump(space = 0, type = "alluvial", 
                     color = "transparent", smooth = 15) +
    scale_fill_viridis_d(option = "A", alpha = .8) +
    theme_sankey_bump(base_size = 16)) 

ggplotly(plot) -> plp 
plp

#-------- colors --------
# collect the 10 colors
cols <- map_dfr(1:10, function(k){
  nm <- plp$x$data[[k]]$name
  filler <- plp$x$data[[k]]$fillcolor
  c(nm = nm, filler = filler)
})
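
The lookup that later steps run against cols can be tried on its own. This is only a minimal base-R sketch: the list `traces` is a hypothetical stand-in for plp$x$data, and the color strings are made up for illustration.

```r
# Hypothetical stand-in for plp$x$data: each trace has a name and a fillcolor.
traces <- list(
  list(name = "A", fillcolor = "rgba(0,0,4,0.8)"),
  list(name = "B", fillcolor = "rgba(28,16,68,0.8)"),
  list(name = "C", fillcolor = "rgba(79,18,123,0.8)")
)

# Build a two-column lookup table, one row per trace.
cols_demo <- data.frame(
  nm     = vapply(traces, function(tr) tr$name, character(1)),
  filler = vapply(traces, function(tr) tr$fillcolor, character(1))
)

# Look up a color by model name (illustrative helper, not part of the answer).
color_for <- function(model) cols_demo$filler[cols_demo$nm == model]

color_for("B")    # the fillcolor stored for model B
color_for("BB")   # character(0): no color is recorded under that name
```

The second call shows why only the trace named exactly "B" gets a real color later on: the extra B groups have no entry in the table.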

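Then I split model B's sales into 10 groups to ensure that each piece is always the smallest bump. This let me collect the stacked values of all the other models, which is needed to push B to the bottom.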

#-------------- splitting B -------------
df1 <- df %>% filter(model != "B") %>% 
  arrange(Year, sales)

df2 <- df %>% filter(model == "B") %>% # this gets used further down
  arrange(Year)

# split B into 10 groups - keep on the bottom, then join the groups
# make the group labels: "B", "BB", ..., "BBBBBBBBBB"
ng <- strrep("B", 1:10)
# add values for these groups by year
df4 <- data.frame(Year = rep(unique(df$Year), each = 10),
                  model = rep(ng, length(unique(df$Year))),
                  sales = rep(df2$sales/10, each = 10))
df5 <- rbind(df1, df4)
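
As a quick sanity check on the split, here is a small base-R sketch. The names `toy` and `parts` and the two-year data are illustrative only; the idea is just to confirm that the ten slices per year add back up to B's original total.

```r
# Toy version of model B's rows (two years only, for illustration).
toy <- data.frame(Year = c(2015, 2016), sales = c(678, 567))

# Ten labels: "B", "BB", ..., "BBBBBBBBBB"
labels <- strrep("B", 1:10)

# One tenth of each year's sales per label.
parts <- data.frame(
  Year  = rep(toy$Year, each = 10),
  model = rep(labels, times = nrow(toy)),
  sales = rep(toy$sales / 10, each = 10)
)

# Each year's slices should sum to the original value.
totals <- as.vector(tapply(parts$sales, parts$Year, sum))
all.equal(totals, toy$sales)
```

Because every slice is a tenth of B's value, each one tends to be smaller than the other models in the same year, which is what the answer relies on to keep them stacked together.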

Re-create the Sankey bump with the 10 sub-parts of model B. Everything that follows applies to this plot.

#-------------- plotly after dividing B -------------
(nplt <- ggplot(df5, aes(x = Year, node = model, fill = model, value = sales)) +
   geom_sankey_bump(space = 0, type = "alluvial", 
                    color = "transparent", smooth = 15) +
   scale_fill_viridis_d(option = "A", alpha = .8) +
   theme_sankey_bump(base_size = 16))

ggplotly(nplt) -> plt
plt

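Create a Sankey bump with just B to capture the data that represents model B at the bottom. Use this data to replace the B trace in the plt object; the extra B groups are dropped. The colors are also fixed here (the original 10 colors from the first plot). Finally, hoverinfo is turned off; that is addressed next.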

#-------------- get values for B at the bottom -------------
df %>% filter(model == "B") %>% 
  ggplot(aes(x = Year,
             node = model,
             fill = model,
             value = sales)) +
  geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
  scale_fill_viridis_d(option = "A", alpha = .8) +
  theme_sankey_bump(base_size = 16) -> bplt
ggplotly(bplt) -> bplotly
bplotly

#------- take divided B and remove all but one trace for B --------
# xx <- plt$x$data
# plt$x$data <- xx[c(1:2, 12:length(xx))] # keep only one B trace

#---------------- adjustments to plt's build --------------------
# change out data for the B trace, add the right colors
wh <- vector(length = 0)
invisible(
  map(1:length(plt$x$data),
      function(j) {
        nm <- plt$x$data[[j]]$name
        plt$x$data[[j]]$hoverinfo <<- "none"
        plt$x$data[[j]]$fillcolor <<- unlist(cols[cols$nm == nm, "filler"], 
                                             use.names = F)
        if(str_detect(nm, "^B$")){
          plt$x$data[[j]]$x <<- bplotly$x$data[[1]]$x
          plt$x$data[[j]]$y <<- bplotly$x$data[[1]]$y
        }
        if(str_detect(nm, "BB")) {
          wh[length(wh) + 1] <<- j # list of unnecessary traces (extra B groups)
        }
      })
)

#----- take divided B and remove all but one trace for B ------
plt$x$data <- plt$x$data[-c(wh)]         # <------ I forgot this line in the previous update

# visualize Sankey bump with B at the bottom
plt
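
One thing worth knowing about the removal step: negative indexing with an empty index vector returns an empty result, not the full list. That should not happen above as long as the split produced the extra "BB..." traces, but a guard makes the step safer to reuse. A minimal sketch with a hypothetical vector of trace names (`drop_traces` is an illustrative helper, not part of plotly):

```r
trace_names <- c("A", "B", "BB", "BBB", "C")   # hypothetical trace names

# "^B$" matches only the single-B trace; "BB" matches the extra slices.
keep_b <- grepl("^B$", trace_names)
extra  <- which(grepl("BB", trace_names))

# Guarded removal: only index negatively when there is something to drop.
drop_traces <- function(x, idx) if (length(idx) > 0) x[-idx] else x

drop_traces(trace_names, extra)        # "A" "B" "C"
drop_traces(trace_names, integer(0))   # unchanged
trace_names[-integer(0)]               # character(0): the pitfall
```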

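The Plotly object is basically 10 blobs of color, with no separation between the years in the background. So if you add a tooltip to it as it is, there can only be one...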

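To get the tooltips you're looking for, I created another trace (actually 10, one for each model). To get the right values (because the sales data is not in the 50K range), I created a new data frame from the data in plt.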

#--------------- collect values for hovertext positions ----------
x <- plt$x$data[[1]]$x
inds <- which(x %in% 2015:2020, arr.ind = T)
yrs <- x[inds]

tellMe <- invisible(
  map(1:length(plt$x$data),
      function(m) {
        y <- plt$x$data[[m]]$y
        y[inds]
      }) %>% setNames(sort(unique(df$model))) %>% # changed from LETTERS[1:10] 
    as.data.frame() %>% 
    mutate(yr = yrs %>% as.integer()) %>% 
    pivot_longer(names_to = "model", values_to = "sales", 
                 cols = sort(unique(df$model))) %>% 
    distinct() %>% 
    group_by(yr, model) %>% 
    summarise(val = mean(sales)) %>% 
    left_join(df, by = c("yr" = "Year", "model" = "model")) %>% 
    as.data.frame() # drop groups
)

#-------------- create data trace for hovertext --------------
plot_ly(tellMe, x = ~yr, y = ~val, split = ~model, 
        customdata = ~sales, text = ~model,
        line = list(width = .01, shape = "spline", smoothing = 1.3),
        hovertemplate = "Year: %{x}<br>Model: %{text}<br>Sales: %{customdata}<extra></extra>",
        type = "scatter", mode = "lines", showlegend = F) -> pp2
pp2

If you look at this plot, it appears blank. That is because the lines are so thin. This is intentional: you don't want lines showing up on the chart.
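
The position lookup used to build the hover trace can be illustrated with plain vectors. In this sketch the x and y values are made up; only the which(x %in% ...) idiom comes from the code above.

```r
# Hypothetical smoothed path: x includes interpolated points between years.
x <- c(2015, 2015.5, 2016, 2016.5, 2017)
y <- c(10, 14, 20, 17, 12)

inds <- which(x %in% 2015:2017)   # positions where x is exactly a whole year
yrs  <- x[inds]
vals <- y[inds]

# One row per year, holding the height at which a hover point would sit.
hover_pts <- data.frame(yr = as.integer(yrs), val = vals)
hover_pts
```

In the real traces a year can match at more than one position, which is why the code above follows up with distinct() and mean() per year and model.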

Fix the colors so that the hover label background matches the legend colors.

# change colors to match sankey
pp2 <- plotly_build(pp2)
invisible(
  map(1:10,
      function(z) {
        nm <- pp2$x$data[[z]]$name
        # collect and assign the color
        cr <- unlist(cols[cols$nm == nm, "filler"], use.names = F)
        pp2$x$data[[z]]$line$color <<- cr
      })
)

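Using subplot here doesn't work. When I tried to add the traces, Plotly gave me an error, whether I added them all at once or one per model. So I forced the traces together.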

#-------------- consolidate the traces (subplot won't work) -----------
# collect data one more time!
dx <- plt$x$data
yx <- pp2$x$data
yx <- append(yx, dx) # put plt on top

# replace data 
plt$x$data <- yx

# lines are small, increase the distance searched for matches
plt %>% layout(hoverdistance = 40)
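
The ordering produced by append() is easy to check on toy lists. This sketch uses placeholder strings in place of real trace objects. The assumption, consistent with the "put plt on top" comment above, is that traces later in the list are drawn over earlier ones.

```r
hover_traces  <- list("hover-A", "hover-B")   # stands in for pp2$x$data
sankey_traces <- list("area-A", "area-B")     # stands in for plt$x$data

# append(x, values) keeps x first and adds values after it.
combined <- append(hover_traces, sankey_traces)

unlist(combined)   # "hover-A" "hover-B" "area-A" "area-B"
```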

The final product:

Answer 1: accepted, 2022-07-06 16:14:15
