How to draw a network diagram from data frame columns in R?
I have a data frame of customers. I want to draw the customer stages as a network diagram. Sample data is shown below.
cust_id  checkin time         stage2               stage3               checkout time
12345    2019-01-01 07:02:50  2019-01-01 07:23:25  2019-01-01 07:23:22  2019-01-01 08:37:43
56789    2019-01-01 07:25:21  2019-01-01 07:35:29  2019-01-01 07:35:27  2019-01-01 09:36:06
43256    2019-01-01 07:27:22  2019-01-01 07:42:49  NA                   2019-01-01 09:34:55
34567    2019-01-01 07:22:15  2019-01-01 08:25:35  2019-01-01 07:26:02  2019-01-01 09:00:40
89765    2019-01-01 08:29:35  2019-01-01 08:30:58  NA                   2019-01-01 09:02:48
23456    2019-01-01 08:54:12  2019-01-01 09:18:46  2019-01-01 09:08:34  2019-01-01 09:46:38
The raw data looks like the above. There is no fixed rule for a customer's path: some customers check out after stage2, while others go on to stage3 and check out after that.
Basically, I want to draw a network map of the customer stages like below:
checkin > stage2 > stage3 > checkout
            |
         checkout
How can I do that in R?
I tried the following with the networkD3 package:
library(igraph)
library(networkD3)
p <- simpleNetwork(df, height = "100px", width = "100px",
                   Source = 1,              # column number of source
                   Target = 5,              # column number of target
                   linkDistance = 10,       # distance between nodes; increase for more space between them
                   charge = -900,           # strength of node repulsion (negative) or attraction (positive)
                   fontSize = 14,           # size of the node names
                   fontFamily = "serif",    # font of the node names
                   linkColour = "#666",     # colour of edges, MUST be a common colour for the whole graph
                   nodeColour = "#69b3a2",  # colour of nodes, MUST be a common colour for the whole graph
                   opacity = 0.9,           # opacity of nodes: 0 = transparent, 1 = no transparency
                   zoom = TRUE              # allow zooming on the figure
)
p
Please help me find a way to do it.
I've found the DiagrammeR package useful. Converting your sample data to the formats used by DiagrammeR would be awkward, so I've done it manually.
library(DiagrammeR)
# Manually represent your data as nodes and edges
nodes <- create_node_df(n=5, label=c("Check in", "Stage 1", "Stage 2", "Stage 3", "Check out"))
edges <- create_edge_df(from = c(1, 2, 3), to = c(2, 3, 4))
lastStage <- c(4, 4, 3, 4, 3, 3)
# Create the base graph
graph <- create_graph(nodes_df=nodes, edges_df=edges)
# Produce the customer graphs
networks <- lapply(lastStage, function(x) graph %>% add_edge(from=x, to=5) %>% render_graph())
networks[[2]]
Giving, as an example, the rendered graph for the second customer.
You have considerable control over the appearance of the graph. See the DiagrammeR home page for details.
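The last-stage vector above was typed in by hand. As a rough sketch, it could instead be derived from the data by checking whether stage3 is missing. The data frame `df` below is a hypothetical cut-down copy of the question's sample (only `cust_id` and `stage3`), and the node numbers assume the node table above, where node 3 is "Stage 2" and node 4 is "Stage 3".

```r
# Hypothetical cut-down copy of the sample data: only stage3 matters here
df <- data.frame(
  cust_id = c(12345, 56789, 43256, 34567, 89765, 23456),
  stage3 = c("2019-01-01 07:23:22", "2019-01-01 07:35:27", NA,
             "2019-01-01 07:26:02", NA, "2019-01-01 09:08:34"),
  stringsAsFactors = FALSE
)

# Assumption: node 3 = "Stage 2", node 4 = "Stage 3" (as in the node table above).
# A customer with no stage3 timestamp checks out from node 3, otherwise from node 4.
last_stage <- ifelse(is.na(df$stage3), 3, 4)
print(last_stage)
```

On the sample data this yields 4, 4, 3, 4, 3, 4. That differs from the hand-typed vector in its last element, because customer 23456 does have a stage3 timestamp.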
Here's one solution using networkD3...
library(tidyverse)
library(lubridate)
library(networkD3)

data <-
  tribble(
    ~cust_id, ~checkin.time, ~stage2, ~stage3, ~checkout.time,
    12345, "2019-01-01 07:02:50", "2019-01-01 07:23:25", "2019-01-01 07:23:22", "2019-01-01 08:37:43",
    56789, "2019-01-01 07:25:21", "2019-01-01 07:35:29", "2019-01-01 07:35:27", "2019-01-01 09:36:06",
    43256, "2019-01-01 07:27:22", "2019-01-01 07:42:49", NA, "2019-01-01 09:34:55",
    34567, "2019-01-01 07:22:15", "2019-01-01 08:25:35", "2019-01-01 07:26:02", "2019-01-01 09:00:40",
    89765, "2019-01-01 08:29:35", "2019-01-01 08:30:58", NA, "2019-01-01 09:02:48",
    23456, "2019-01-01 08:54:12", "2019-01-01 09:18:46", "2019-01-01 09:08:34", "2019-01-01 09:46:38"
  ) %>%
  mutate(across(!cust_id, ~ymd_hms(.x, tz = "UTC")))

data %>%
  select(-cust_id) %>%
  # replace each timestamp with its column (stage) name; skipped stages stay NA
  mutate(across(everything(), ~if_else(is.na(.x), NA_character_, cur_column()))) %>%
  mutate(row = row_number()) %>%
  mutate(origin = .[[1]]) %>%
  # one row per customer and stage
  gather("column", "source", -row, -origin) %>%
  # turn the stage name into its column position so stages can be ordered
  mutate(column = match(column, names(data))) %>%
  filter(!is.na(source)) %>%
  arrange(row, column) %>%
  group_by(row) %>%
  # the target is the next stage the same customer visited
  mutate(target = lead(source)) %>%
  ungroup() %>%
  filter(!is.na(source) & !is.na(target)) %>%
  # label checkout by the stage it was reached from, so each path gets its own checkout node
  mutate(target = if_else(target == "checkout.time", paste0(target, " from ", source), target)) %>%
  select(source, target, origin) %>%
  group_by(source, target, origin) %>%
  summarise(count = n()) %>%
  ungroup() %>%
  simpleNetwork()
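To make the idea behind the pipeline easier to follow, here is a minimal base R sketch of the same edge-building step: link each stage a customer visited to the next stage they visited, then count the transitions. The `present` matrix is a hand-copied summary of which timestamps are non-missing in the sample data, and the names `stages`, `present`, `edge_list` and `edge_counts` are illustrative only. Unlike the pipeline above, this sketch does not rename the checkout node per path.

```r
# Stage names in visiting order, and which stages each sample customer has
# a timestamp for (TRUE = present), copied by hand from the sample data
stages <- c("checkin.time", "stage2", "stage3", "checkout.time")
present <- rbind(
  c(TRUE, TRUE, TRUE,  TRUE),   # 12345
  c(TRUE, TRUE, TRUE,  TRUE),   # 56789
  c(TRUE, TRUE, FALSE, TRUE),   # 43256
  c(TRUE, TRUE, TRUE,  TRUE),   # 34567
  c(TRUE, TRUE, FALSE, TRUE),   # 89765
  c(TRUE, TRUE, TRUE,  TRUE)    # 23456
)

# For each customer, pair every visited stage with the next visited stage
edge_list <- do.call(rbind, lapply(seq_len(nrow(present)), function(i) {
  visited <- stages[present[i, ]]
  data.frame(source = head(visited, -1),
             target = tail(visited, -1),
             stringsAsFactors = FALSE)
}))

# Count how many customers took each transition
edge_counts <- aggregate(count ~ source + target,
                         data = transform(edge_list, count = 1L),
                         FUN = sum)
print(edge_counts)
```

Since `simpleNetwork()` reads the source and target from the first two columns by default, a data frame shaped like `edge_counts` should be usable as its input in the same way as the pipeline's result.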