Graph learning in R, igraph, tidygraph

I have a graph in which each node has a value (shown in red).

[Figure: the sample graph, with node ids as labels and node values in red]

I would like to do the following two things (I guess 1 is a special case of 2):

  1. Each node should be assigned the mean of the values of the direct peers pointing to it. For example, node #5: (1+2)/2 = 1.5, or node #3: (0+2+0)/3 = 2/3.

  2. Instead of only direct neighbors, include all connected nodes, but scale each contribution by 1/n, with n being the distance to the node. The further away the information comes from, the weaker the signal.
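One way to write the two targets down, as a sketch only: let v_j be the value of node j, N_in(i) the set of nodes with an edge pointing to node i, R(i) the set of nodes from which node i can be reached, and d(j, i) the length of the shortest directed path from j to i. These symbols are introduced here for illustration, and dividing by |R(i)| in the second formula is an assumption, since the description above does not fix the normalisation.

```latex
s_i^{(1)} = \frac{1}{|N_{\mathrm{in}}(i)|} \sum_{j \in N_{\mathrm{in}}(i)} v_j ,
\qquad
s_i^{(2)} = \frac{1}{|R(i)|} \sum_{j \in R(i)} \frac{v_j}{d(j, i)}
```

Restricting R(i) to the nodes at distance 1 turns the second formula into the first, which is the sense in which 1 is a special case of 2.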

I looked into the functions of igraph but could not find anything that does this (I might have overlooked it, though). How could I do this computation?

Below is the code for a sample network with random values.

library(tidyverse)
library(tidygraph)
library(ggraph)

set.seed(6)
q <- tidygraph::play_erdos_renyi(6, p = 0.2) %>% 
  mutate(id = row_number(),
         value = sample(0:3, size = 6, replace = TRUE))
q %>% 
  ggraph(layout = "with_fr") +
  geom_edge_link(arrow = arrow(length = unit(0.2, "inches"), 
                               type = "closed")) +
  geom_node_label(aes(label = id)) +
  geom_node_text(aes(label = value), color = "red", size = 7, 
                 nudge_x = 0.2, nudge_y = 0.2)

Edit: found a solution to 1

q %>% 
  mutate(value_smooth = map_local_dbl(order = 1, mindist = 1, mode = "in", 
                                      .f = function(neighborhood, ...) {
    mean(as_tibble(neighborhood, active = 'nodes')$value)
  }))

Edit 2: a solution to 2, though I guess not the most elegant one

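q %>% 
  # order must span the whole graph so that every node that can reach the
  # focal node is included; order = 1 would only see the direct peers
  mutate(value_smooth = map_local_dbl(order = igraph::gorder(q), mindist = 0, mode = "in", 
                                      .f = function(neighborhood, node, ...) {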
    ne <- neighborhood
    
    ne <- ne %>%
      mutate(d = node_distance_to(which(as_tibble(ne, 
                                                  active = "nodes")$id == node)))
    
    as_tibble(ne, active = 'nodes') %>% 
      filter(d != 0) %>% 
      mutate(helper = value/d) %>% 
      summarise(m = mean(helper)) %>%  # average the distance-scaled values
      pull(m)
    }))

Edit 3: a faster alternative to map_local_dbl

map_local loops through all nodes of the graph, which takes very long for large graphs. Just computing the means does not need this loop. A much faster alternative is to use the adjacency matrix and some matrix multiplication.

q_adj <- q %>% 
  igraph::as_adjacency_matrix()

# out: mean value of the nodes each node points to
(q_adj %*% as_tibble(q)$value) / Matrix::rowSums(q_adj)

# in: mean value of the nodes pointing to each node
(t(q_adj) %*% as_tibble(q)$value) / Matrix::colSums(q_adj)

The square of the adjacency matrix is the second-order adjacency matrix (its entries count the walks of length two between nodes), and so forth for higher powers. So a solution to problem 2 could also be built this way.
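A minimal sketch of a loop-free version of problem 2. Powers of the adjacency matrix count walks rather than shortest-path distances, so this sketch swaps in igraph::distances() to get the distances directly. The toy graph and its values are hypothetical (not the graph from the question), and dividing by the number of nodes that can reach each node is an assumption that mirrors the mean() used in Edit 2.

```r
library(igraph)

# Hypothetical toy graph: a directed chain 1 -> 2 -> 3 -> 4
g <- make_graph(edges = c(1, 2, 2, 3, 3, 4), directed = TRUE)
values <- c(6, 2, 3, 0)

# d[i, j] = length of the shortest directed path from node i to node j
# (Inf when j cannot be reached from i, 0 on the diagonal)
d <- distances(g, mode = "out")

# Weight 1/d for reachable nodes at distance >= 1, and 0 otherwise
w <- ifelse(is.finite(d) & d > 0, 1 / d, 0)

# Column j of w holds the weights of all nodes that can reach node j
n_reaching <- colSums(w > 0)
value_smooth <- as.vector(t(w) %*% values) / n_reaching

# Nodes that nothing points to get NaN (0 / 0), as in the map_local_dbl version
value_smooth  # roughly NaN, 6, 2.5, 2
```

For node 4, for instance, the contributions are 6/3, 2/2 and 3/1, whose mean is 2. Note that distances() builds a dense n-by-n matrix, so for very large graphs this trades the loop for memory.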

Edit 4: direct weighted mean

Say the original graph has a weight associated with each edge.

q <- q %>% 
  activate(edges) %>% 
  mutate(w = c(1,0.5,1,0.5,1,0.5,1)) %>% 
  activate(nodes)

We would like to compute the weighted mean of the direct peers' values.

q_adj_wgt <- q %>% 
  igraph::as_adjacency_matrix(attr = "w")

# out: weighted mean value of the nodes each node points to
(q_adj_wgt %*% as_tibble(q)$value) / Matrix::rowSums(q_adj_wgt)

# in: weighted mean value of the nodes pointing to each node
(t(q_adj_wgt) %*% as_tibble(q)$value) / Matrix::colSums(q_adj_wgt)

You can probably try the code below

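# ego() and set_vertex_attr() come from igraph
q %>%
    igraph::set_vertex_attr(
        name = "value",
        value = sapply(
            # in-neighbors only; mindist = 1 excludes the node itself
            igraph::ego(., mode = "in", mindist = 1),
            function(x) mean(x$value)
        )
    )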

which gives

# A tbl_graph: 6 nodes and 7 edges
#
# A directed simple graph with 1 component
#
# Node Data: 6 x 2 (active)
     id   value
  <int>   <dbl>
1     1   0.5
2     2 NaN
3     3   0.667
4     4 NaN
5     5   1.5
6     6 NaN
#
# Edge Data: 7 x 2
   from    to
  <int> <int>
1     3     1
2     6     1
3     1     3
# ... with 4 more rows

Each node should be assigned the mean of the value of the direct peers directing to it.

I am guessing that you really mean:

Each node should be assigned the mean of the values of the direct peers directing to it, before any node values were changed.

This seems trivial. Maybe I am missing something?

Loop over nodes
    Sum values of adjacent nodes
    Calculate mean and store in vector by node index
Loop over nodes
    Set node value to mean stored in previous loop
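The two passes above could look roughly like this in R with igraph. This is a sketch on a hypothetical three-node graph rather than the graph from the question; the names new_value and peers are illustrative, and "adjacent" is read here as the in-neighbors, following the clarified sentence above.

```r
library(igraph)

# Hypothetical toy graph: 1 -> 3, 2 -> 3, 3 -> 1
g <- make_graph(edges = c(1, 3, 2, 3, 3, 1), directed = TRUE)
values <- c(1, 2, 4)

# First pass: compute every mean from the unchanged values
new_value <- numeric(vcount(g))
for (i in seq_len(vcount(g))) {
  peers <- as_ids(neighbors(g, i, mode = "in"))  # ids of nodes pointing to i
  new_value[i] <- mean(values[peers])            # NaN if nothing points to i
}

# Second pass: only now overwrite the node values, all at once
V(g)$value <- new_value

new_value  # 4, NaN, 1.5
```

Keeping the means in a separate vector until the first pass is finished is what guarantees that no mean is computed from an already updated value.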
