Set sankey flow.fill to last node
In my example sankey diagram, flow.fill and flow.color are set by the previous node (e.g., all "yellow" flows at time_0 have yellow fill). I would like to color the flows by the final node instead. For instance, all flows going into "yellow" at time_1 (yellow-yellow, red-yellow) should have yellow fill, rather than what you see below (red-yellow is red).
library(tidyverse)
library(ggsankey)
set.seed(2)
# standard sankey
df <- tibble(
  id = seq_len(22168),
  time_0 = c(rep("red", 13309), rep("yellow", 8699), rep("green", 160)),
  time_1 = c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
             rep("red", 321), rep("yellow", 1940), rep("green", 6438),
             rep("red", 4), rep("yellow", 26), rep("green", 130))
) %>%
  mutate(across(starts_with("time"),
                ~ factor(.x, levels = c("green", "yellow", "red"))))
# one row per (stage, node) pair, as geom_sankey expects
df_sankey <- df %>%
  ggsankey::make_long(time_0, time_1)
# node sizes, used in the labels
df_sankey_t <- df_sankey %>%
  dplyr::count(x, node)
df_sankey <- df_sankey %>%
  left_join(df_sankey_t, by = c("x", "node"))
ggplot(df_sankey,
       aes(x = x, next_x = next_x,
           node = node, next_node = next_node,
           fill = factor(node),
           label = paste0(node, " n=", n))) +
  geom_sankey(flow.alpha = 0.6, node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  # named values so each colour is matched to its level explicitly
  scale_fill_manual(values = c(green = "green", red = "red", yellow = "yellow")) +
  theme_sankey(base_size = 18) +
  theme(legend.position = "none",
        plot.title.position = "plot",
        plot.title = element_text(face = "bold", size = 20),
        plot.subtitle = element_text(size = 15)) +
  labs(title = "Example sankey diagram",
       subtitle = "Would like to color flow.fill and flow.color to be based on last node",
       x = NULL)
You can get part of the way using PantaRhei, and all the way with some input from grid.
library(PantaRhei)
library(dplyr)
library(tibble)
df1 <- tibble(
  id = seq_len(22168),
  time_0 = c(rep("red", 13309), rep("yellow", 8699), rep("green", 160)),
  time_1 = c(rep("red", 1110), rep("yellow", 3771), rep("green", 8428),
             rep("red", 321), rep("yellow", 1940), rep("green", 6438),
             rep("red", 4), rep("yellow", 26), rep("green", 130))
) |>
  mutate(across(starts_with("time"),
                ~ factor(.x, levels = c("green", "yellow", "red"))))
# summarise data for flows
# the column names are the ones PantaRhei expects when processing the data
# the PantaRhei documentation uses the 'substance' variable to name what is flowing;
# in this case we'll use it to determine the fill.
# There may be more efficient ways to define unique 'from' and 'to' variables depending on your data.
flows <-
  df1 |>
  group_by(time_0, time_1) |>
  summarise(quantity = n(), .groups = "drop") |>
  mutate(substance = time_1,
         from = case_when(time_0 == "yellow" ~ "B",
                          time_0 == "red" ~ "A",
                          time_0 == "green" ~ "C"),
         to = case_when(time_1 == "yellow" ~ "D",
                        time_1 == "red" ~ "E",
                        time_1 == "green" ~ "F"))
# build up a nodes data frame
# to set labels and position of nodes
nodes <-
  data.frame(ID = c(unique(flows$from), unique(flows$to)),
             label = c(unique(flows$from), unique(flows$to)),
             x = c(rep(1, 3), rep(2, 3)),
             y = c("1", "1.25", "1.5", "C", "B", "A"),
             label_pos = rep(c("left", "right"), each = 3))
colors <- tribble(
  ~substance, ~color,
  "yellow", "yellow",
  "red", "red",
  "green", "green"
)
sankey(nodes, flows, colors, legend = FALSE)
# PantaRhei limitations
# Although you can change the colour of the nodes, there does not seem to be a way to colour nodes individually.
# While individual nodes are labelled, I could not find a way to label the node columns (not sure of the correct term).
# Unable to control the formatting for node quantity.
# Not sure how to control the order of the 'to' nodes.
# As PantaRhei is built on grid, you could probably add these features to your plot.
# 'grid' based edits noted below:
library(grid)
# functions to inspect the grid tree.
# grid.force()
# grid.ls()
# grid.ls(grobs = FALSE, viewports = TRUE)
# make edits following inspection and a bit of trial and error...
# note: "GRID.polygon.N" names are auto-generated, so the numbers may differ
# in your session; check them with grid.ls() first
grid.edit("GRID.polygon.38", gp = gpar(fill = "red"))
grid.edit("GRID.polygon.33", gp = gpar(fill = "yellow"))
grid.edit("GRID.polygon.29", gp = gpar(fill = "green"))
grid.edit("GRID.polygon.17", gp = gpar(fill = "red"))
grid.edit("GRID.polygon.21", gp = gpar(fill = "yellow"))
grid.edit("GRID.polygon.25", gp = gpar(fill = "green"))
# Get back to the root viewport
popViewport()
# add labels to node columns
grid.text(label = c("time_0", "time_1"), x = c(0.25, 0.75), y = rep(0.15, 2))
# you could also edit the node labels
# I've just edited one as an example
grid.edit("GRID.text.39", label = "Red")
# There may be more efficient ways to achieve the effect you desire.
# I'm still getting to grips with grid...!
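The grid edits above rely on grob names found by inspection. As a rough sketch of how that lookup could be done in code rather than by eye, the example below draws two stand-in polygons, lists the grobs with grid.ls(print = FALSE), and recolours the matches in a loop. The names flow_red_yellow and flow_yellow_yellow are made up for illustration; for a PantaRhei plot the names are auto-generated, so the pattern would need to match something like "polygon" instead, and working out which polygon is a flow and which is a node would still take some inspection.

```r
library(grid)

pdf(NULL)        # off-screen device so the sketch runs without opening a window
grid.newpage()

# Two stand-in "flows" with explicit (hypothetical) names
grid.polygon(x = c(0.1, 0.9, 0.9, 0.1), y = c(0.6, 0.6, 0.8, 0.8),
             name = "flow_red_yellow", gp = gpar(fill = "red"))
grid.polygon(x = c(0.1, 0.9, 0.9, 0.1), y = c(0.2, 0.2, 0.4, 0.4),
             name = "flow_yellow_yellow", gp = gpar(fill = "yellow"))

# List the grobs on the display list without printing, then keep the flows
listing <- grid.ls(print = FALSE)
flow_names <- grep("^flow_", listing$name, value = TRUE)

# Recolour each matched grob by its destination (the part after the last "_")
for (nm in flow_names) {
  grid.edit(nm, gp = gpar(fill = sub(".*_", "", nm)))
}

# Read the fills back to confirm the edits took effect
new_fills <- vapply(flow_names, function(nm) grid.get(nm)$gp$fill, character(1))
dev.off()
```

Giving grobs explicit names is only possible when you draw them yourself; with a plot produced by another package you are limited to whatever names grid.ls() reports.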
Created on 2023-06-03 with reprex v2.0.2
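On the comment about more efficient ways to define 'from' and 'to': one option is to build the IDs from the stage and the level instead of writing a case_when per colour. The sketch below uses a tiny made-up data set, and the t0_/t1_ prefixes are just an assumed naming scheme. It only builds the ID, label and x columns, so the y positions (and label_pos) would still need to be added as in the code above before passing the result to sankey().

```r
library(dplyr)
library(tibble)

# Small stand-in for the question's data (rows are made up for illustration)
df_small <- tibble(
  time_0 = c("red", "red", "yellow", "green", "red"),
  time_1 = c("yellow", "yellow", "yellow", "green", "red")
)

flows_small <- df_small |>
  count(time_0, time_1, name = "quantity") |>
  mutate(
    substance = time_1,            # fill follows the destination node
    from = paste0("t0_", time_0),  # hypothetical ID scheme: stage prefix + level
    to   = paste0("t1_", time_1)
  )

# One row per node; the label drops the stage prefix again
nodes_small <- tibble(ID = unique(c(flows_small$from, flows_small$to))) |>
  mutate(label = sub("^t[01]_", "", ID),
         x = ifelse(startsWith(ID, "t0_"), 1, 2))
```

Because the IDs are derived from the data, adding a fourth colour or a third time point would not need another hand-written mapping.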