
Creating edge attributes by combining attributes of incident vertices using igraph (R)

For each edge in a graph I would like to add a numeric attribute (weight) that is the product of an attribute (probability) of the incident vertices. I can do it by looping over the edges; that is:

    for (i in E(G)) {
      ind <- V(G)[inc(i)]
      p <- get.vertex.attribute(G, name = "prob", index=ind)
      E(G)[i]$weight <- prod(p)
    }

However, this is quite slow for my graph (|V| ~= 20,000 and |E| ~= 200,000). Is there a faster way to do this operation?

Here is probably the fastest solution. The key is to vectorize.

    library(igraph)
    G <- graph.full(45)
    set.seed(1)
    # one random probability per vertex (pnorm(vcount(G)) alone yields a single value of 1)
    V(G)$prob <- pnorm(rnorm(vcount(G)))

    ## Original solution
    system.time(
      for (i in E(G)) {
        ind <- V(G)[inc(i)]
        p <- get.vertex.attribute(G, name = "prob", index=ind)
        E(G)[i]$wt.1 <- prod(p)
      }
    )
    #>    user  system elapsed 
    #>   1.776   0.011   1.787 

    ## sapply solution
    system.time(
      E(G)$wt.2 <- sapply(E(G), function(e) prod(V(G)[inc(e)]$prob))
    )
    #>    user  system elapsed 
    #>   1.275   0.003   1.279 

    ## vectorized solution 
    system.time({
      el <- get.edgelist(G)
      E(G)$wt.3 <- V(G)[el[, 1]]$prob * V(G)[el[, 2]]$prob
    })
    #>    user  system elapsed 
    #>   0.003   0.000   0.003 

    ## are they the same?
    identical(E(G)$wt.1, E(G)$wt.2)
    #> [1] TRUE
    ## all.equal() allows for floating-point rounding differences between prod() and `*`
    all.equal(E(G)$wt.1, E(G)$wt.3)
    #> [1] TRUE

The vectorized solution seems to be about 500 times faster, although more and better measurements would be needed to evaluate this more precisely.
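For reference, here is a minimal sketch of the same vectorized idea using the function names from igraph 1.0 and later (`make_full_graph()` and `ends()`), which is an assumption about the installed version. It writes to the `weight` attribute from the question. Using `ends(..., names = FALSE)` is a design choice rather than part of the answer above: it returns numeric vertex ids even when the graph has a `name` vertex attribute, so the probability vector can be indexed directly.

```r
library(igraph)

set.seed(1)
G <- make_full_graph(45)
V(G)$prob <- pnorm(rnorm(vcount(G)))

# Two-column matrix of numeric vertex ids, one row per edge.
el <- ends(G, E(G), names = FALSE)

# Extract the vertex attribute once, then index it by each endpoint column.
p <- V(G)$prob
E(G)$weight <- p[el[, 1]] * p[el[, 2]]

head(E(G)$weight)
```

Extracting `V(G)$prob` once avoids building a vertex sequence for every lookup; the edge order of `el` matches `E(G)`, so the product lines up with the edges.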

Converting my comment to an answer.

    library(igraph)
    # sample data - you should have provided this!!!
    G <- graph.full(10)
    set.seed(1)
    V(G)$prob <- pnorm(rnorm(10))
    length(E(G))

    # for-loop
    for (i in E(G)) {
      ind <- V(G)[inc(i)]
      p <- get.vertex.attribute(G, name = "prob", index=ind)
      E(G)[i]$wt.1 <- prod(p)
    }

    # sapply
    E(G)$wt.2 <- sapply(E(G),function(e) prod(V(G)[inc(e)]$prob))

    # are they the same?
    identical(E(G)$wt.1, E(G)$wt.2)

With just 10 vertices and 45 edges, sapply(...) is about 4 times faster than the for loop; with 100 vertices and ~5,000 edges, it is about 6 times faster.
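Since the speedups quoted in this thread depend on graph size and machine, a rough way to check them yourself is sketched below. The helper `time_weights()` is hypothetical (not from either answer), and the sketch assumes igraph 1.0 or later for `make_full_graph()`, `ends()` and `.inc()`. It times the per-edge `sapply` approach against the vectorized one on full graphs of a few sizes; single `system.time()` runs are noisy, so treat the numbers as indicative only.

```r
library(igraph)

# Hypothetical helper: time both approaches on a full graph with n vertices
# and return the elapsed seconds in a one-row data frame.
time_weights <- function(n) {
  G <- make_full_graph(n)
  V(G)$prob <- pnorm(rnorm(n))

  # Per-edge approach: look up the incident vertices one edge at a time.
  t_sapply <- system.time(
    w1 <- sapply(seq_len(ecount(G)),
                 function(e) prod(V(G)[.inc(e)]$prob))
  )[["elapsed"]]

  # Vectorized approach: index the probabilities by both endpoint columns.
  t_vec <- system.time({
    el <- ends(G, E(G), names = FALSE)
    w2 <- V(G)$prob[el[, 1]] * V(G)$prob[el[, 2]]
  })[["elapsed"]]

  # Both approaches should agree up to floating-point rounding.
  stopifnot(isTRUE(all.equal(w1, w2)))

  data.frame(vertices = n, edges = ecount(G),
             sapply_sec = t_sapply, vectorized_sec = t_vec)
}

set.seed(1)
timings <- do.call(rbind, lapply(c(10, 20, 40), time_weights))
print(timings)
```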
