
How to draw a network diagram from data frame columns in R?

I have a data frame of customers. I want to draw the customer stages as a network diagram. Sample data is below.

cust_id     checkin time           stage2                     stage3              checkout time
12345   2019-01-01 07:02:50     2019-01-01 07:23:25        2019-01-01 07:23:22  2019-01-01 08:37:43
56789   2019-01-01 07:25:21     2019-01-01 07:35:29        2019-01-01 07:35:27  2019-01-01 09:36:06
43256   2019-01-01 07:27:22     2019-01-01 07:42:49        NA                   2019-01-01 09:34:55
34567   2019-01-01 07:22:15     2019-01-01 08:25:35        2019-01-01 07:26:02  2019-01-01 09:00:40
89765   2019-01-01 08:29:35     2019-01-01 08:30:58        NA                   2019-01-01 09:02:48
23456   2019-01-01 08:54:12     2019-01-01 09:18:46        2019-01-01 09:08:34  2019-01-01 09:46:38

The raw data looks like the above. There is no fixed path for a customer, i.e. some customers check out after stage2, while others have to go through stage3 and check out after that.

Basically, I want to draw a network map of the customers' stages like below:

checkin > stage2 > stage3 > checkout
             |
            checkout

How can I do that in R?
I tried the following with the networkD3 package:

library(igraph)
library(networkD3)
p <- simpleNetwork(df, height="100px", width="100px",        
                   Source = 1,                 # column number of source
                   Target = 5,                 # column number of target
                   linkDistance = 10,          # distance between node. Increase this value to have more space between nodes
                   charge = -900,                # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value)
                   fontSize = 14,               # size of the node names
                   fontFamily = "serif",       # font of node names
                   linkColour = "#666",        # colour of edges, MUST be a common colour for the whole graph
                   nodeColour = "#69b3a2",     # colour of nodes, MUST be a common colour for the whole graph
                   opacity = 0.9,              # opacity of nodes. 0=transparent. 1=no transparency
                   zoom = TRUE                 # Can you zoom on the figure?
)

p

Please help me find a way to do it.

I've found the DiagrammeR package useful. Converting your sample data to the formats used by DiagrammeR would be awkward, so I've done it manually.

library(DiagrammeR)

# Manually represent your data as nodes and edges
nodes <- create_node_df(n=5, label=c("Check in", "Stage 1", "Stage 2", "Stage 3", "Check out"))
edges <- create_edge_df(from = c(1, 2, 3), to = c(2, 3, 4))
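# Node id of each customer's last stage before check out (3 = "Stage 2", 4 = "Stage 3").
# Customers 3 and 5 have no stage3 timestamp in the sample data, so they end at node 3.
lastStage <- c(4, 4, 3, 4, 3, 4)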

# Create the base graph
graph <- create_graph(nodes_df=nodes, edges_df=edges) 

# Produce the customer graphs
networks <- lapply(lastStage, function(x) graph %>% add_edge(from=x, to=5) %>% render_graph())
networks[[2]]

Giving, as an example:

[Image: graph output]

You have considerable control over the appearance of the graph. See the DiagrammeR home page for details.
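
The manual step above could probably be automated. Below is a minimal sketch, assuming the sample data from the question is held in a data frame with the shortened column names `checkin`, `stage2`, `stage3` and `checkout` (my naming, not the asker's) and that column order reflects the order of the stages. For each customer it collects the stages that are not NA, pairs consecutive ones, and counts each transition. The DiagrammeR part at the end is guarded so the base R part runs without the package installed.

```r
# Sample data from the question; NA marks a stage the customer skipped.
# The column names are shortened here for convenience (an assumption).
df <- data.frame(
  cust_id  = c(12345, 56789, 43256, 34567, 89765, 23456),
  checkin  = c("2019-01-01 07:02:50", "2019-01-01 07:25:21", "2019-01-01 07:27:22",
               "2019-01-01 07:22:15", "2019-01-01 08:29:35", "2019-01-01 08:54:12"),
  stage2   = c("2019-01-01 07:23:25", "2019-01-01 07:35:29", "2019-01-01 07:42:49",
               "2019-01-01 08:25:35", "2019-01-01 08:30:58", "2019-01-01 09:18:46"),
  stage3   = c("2019-01-01 07:23:22", "2019-01-01 07:35:27", NA,
               "2019-01-01 07:26:02", NA, "2019-01-01 09:08:34"),
  checkout = c("2019-01-01 08:37:43", "2019-01-01 09:36:06", "2019-01-01 09:34:55",
               "2019-01-01 09:00:40", "2019-01-01 09:02:48", "2019-01-01 09:46:38"),
  stringsAsFactors = FALSE
)

stage_cols <- c("checkin", "stage2", "stage3", "checkout")

# For each customer, keep the visited (non-NA) stages in column order
# and pair each stage with the one that follows it.
edge_list <- lapply(seq_len(nrow(df)), function(i) {
  visited <- stage_cols[!is.na(unlist(df[i, stage_cols]))]
  data.frame(from = head(visited, -1), to = tail(visited, -1),
             stringsAsFactors = FALSE)
})
edges <- do.call(rbind, edge_list)

# Count how many customers made each transition.
edges$n <- 1L
edge_counts <- aggregate(n ~ from + to, data = edges, FUN = sum)
print(edge_counts)

# Optional: build one combined DiagrammeR graph from the derived edges.
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  nodes <- DiagrammeR::create_node_df(n = length(stage_cols), label = stage_cols)
  edges_df <- DiagrammeR::create_edge_df(
    from = match(edge_counts$from, stage_cols),  # stage name -> node id
    to   = match(edge_counts$to, stage_cols)
  )
  graph <- DiagrammeR::create_graph(nodes_df = nodes, edges_df = edges_df)
  # DiagrammeR::render_graph(graph) would draw the combined diagram
}
```

For the sample data this gives four distinct transitions, with the two customers lacking a stage3 timestamp going straight from stage2 to checkout.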

Here's one solution using networkD3...

library(tidyverse)
library(lubridate)
library(networkD3)

data <- 
  tribble(
  ~cust_id, ~checkin.time,         ~stage2,               ~stage3,               ~checkout.time,
  12345,    "2019-01-01 07:02:50", "2019-01-01 07:23:25", "2019-01-01 07:23:22", "2019-01-01 08:37:43",
  56789,    "2019-01-01 07:25:21", "2019-01-01 07:35:29", "2019-01-01 07:35:27", "2019-01-01 09:36:06",
  43256,    "2019-01-01 07:27:22", "2019-01-01 07:42:49", NA,                    "2019-01-01 09:34:55",
  34567,    "2019-01-01 07:22:15", "2019-01-01 08:25:35", "2019-01-01 07:26:02", "2019-01-01 09:00:40",
  89765,    "2019-01-01 08:29:35", "2019-01-01 08:30:58", NA,                    "2019-01-01 09:02:48",
  23456,    "2019-01-01 08:54:12", "2019-01-01 09:18:46", "2019-01-01 09:08:34", "2019-01-01 09:46:38"
  ) %>% 
  mutate(across(!cust_id, ~ymd_hms(.x, tz = "UTC")))

data %>% 
  select(-cust_id) %>% 
  mutate(across(everything(), ~ if_else(is.na(.x), NA_character_, cur_column()))) %>%  # replace each timestamp with its column (stage) name
  mutate(row = row_number()) %>%
  mutate(origin = .[[1]]) %>%
  gather("column", "source", -row, -origin) %>%
  mutate(column = match(column, names(data))) %>%
  filter(!is.na(source)) %>% 
  arrange(row, column) %>%
  group_by(row) %>%
  mutate(target = lead(source)) %>%
  ungroup() %>%
  filter(!is.na(source) & !is.na(target)) %>%
  mutate(target = if_else(target == "checkout.time", paste0(target, " from ", source), target)) %>% 
  select(source, target, origin) %>%
  group_by(source, target, origin) %>%
  summarise(count = n()) %>%
  ungroup() %>%
  simpleNetwork()
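
One thing worth checking: the pipeline above orders the stages by column position, not by timestamp. In the sample data every recorded stage3 time is earlier than the same customer's stage2 time, so a path built from the timestamps themselves would look different. The following is a minimal base R sketch of that alternative, using shortened column names of my own choosing. Whether column order or timestamp order is the right notion of sequence depends on how the data was recorded, which the question does not say.

```r
# Sample timestamps from the question, parsed as UTC (time zone is an assumption).
stamp <- function(x) as.POSIXct(x, tz = "UTC")
times <- data.frame(
  checkin  = stamp(c("2019-01-01 07:02:50", "2019-01-01 07:25:21", "2019-01-01 07:27:22",
                     "2019-01-01 07:22:15", "2019-01-01 08:29:35", "2019-01-01 08:54:12")),
  stage2   = stamp(c("2019-01-01 07:23:25", "2019-01-01 07:35:29", "2019-01-01 07:42:49",
                     "2019-01-01 08:25:35", "2019-01-01 08:30:58", "2019-01-01 09:18:46")),
  stage3   = stamp(c("2019-01-01 07:23:22", "2019-01-01 07:35:27", NA,
                     "2019-01-01 07:26:02", NA, "2019-01-01 09:08:34")),
  checkout = stamp(c("2019-01-01 08:37:43", "2019-01-01 09:36:06", "2019-01-01 09:34:55",
                     "2019-01-01 09:00:40", "2019-01-01 09:02:48", "2019-01-01 09:46:38"))
)
stage_cols <- names(times)

# Build one path string per customer, with stages sorted by their timestamps.
path_of <- function(i) {
  secs <- sapply(stage_cols, function(s) as.numeric(times[[s]][i]))
  visited <- !is.na(secs)
  paste(stage_cols[visited][order(secs[visited])], collapse = " > ")
}
paths <- vapply(seq_len(nrow(times)), path_of, character(1))

# Tabulate the distinct chronological paths.
print(table(paths))
```

If the chronological paths differ from the column-order ones, it may be worth confirming what the stage2 and stage3 timestamps actually record before drawing the diagram.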

