
sna R package efficiency function inconsistency

I have three different matrices, and their Krackhardt efficiency values seem wrong to me. The first two matrices are topologically equivalent, but their efficiencies differ. Does anyone have an explanation for the inconsistency?

For the first matrix, efficiency is 1:

library(network)  # provides network() and plot.network()
library(sna)      # provides efficiency()
A <- matrix(c(0,1,0,0,0,0,0,0,0,0,
          0,0,1,0,0,0,0,0,0,0,
          0,0,0,1,0,0,0,0,0,0,
          0,0,0,0,1,0,0,0,0,0,
          0,0,0,0,0,1,0,0,0,1,
          0,0,0,0,0,0,1,0,0,0,
          0,0,0,0,0,0,0,1,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,1,0),ncol=10)
A_net <- network(A,directed=TRUE)
g_eff <- efficiency(A_net)
plot.network(A_net, vertex.col = "white", vertex.border = col_ama, 
         usearrows=FALSE, edge.col=col_gri, vertex.lwd = 3.5,
         vertex.cex = 3.5)
title(paste("Efficiency =",round(g_eff,3)))

Equivalent matrix with different efficiency:

A <- matrix(c(0,0,0,0,1,0,0,0,0,0,
          1,0,1,0,0,0,0,0,0,0,  
          0,0,0,0,0,0,1,0,0,0,
          0,0,1,0,0,0,0,0,0,0,
          0,0,0,0,0,1,0,0,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,1,0,
          0,0,0,1,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,0,1,
          0,0,0,0,0,0,0,0,0,0),ncol=10)
A_net <- network(A,directed=FALSE)
g_eff <- efficiency(A_net)
plot.network(A_net, vertex.col = "white", vertex.border = col_ama, 
         usearrows=FALSE, edge.col=col_gri, vertex.lwd = 3.5,
         vertex.cex = 3.5)
title(paste("Efficiency =",round(g_eff,3)))

This third matrix has two components, each with the minimum number of edges (n_i-1), but its efficiency is not 1. It doesn't match the formula in the help either:

(1 - [ |E(G)| - Sum(N_i-1,i=1,..,n) ]/[ Sum((N_i-1)^2,i=1,..,n) ]   = 1-[8-(2+4)]/[4+16] = .9)

Third matrix:

A <- matrix(c(0,0,0,0,1,0,0,0,0,0,
          1,0,0,0,0,0,0,0,0,0,  
          0,0,0,0,0,0,1,0,0,0,
          0,0,1,0,0,0,0,0,0,0,
          0,0,0,0,0,1,0,0,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,1,0,
          0,0,0,1,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,0,1,
          0,0,0,0,0,0,0,0,0,0),ncol=10)
A_net <- network(A,directed=FALSE)
g_eff <- efficiency(A_net)
g_eff
plot.network(A_net, vertex.col = "white", vertex.border = col_ama, 
         usearrows=FALSE, edge.col=col_gri, vertex.lwd = 3.5,
         vertex.cex = 3.5)
title(paste("Efficiency =",round(g_eff,3)))

It appears that your example is in error: in the first case, you are working with a directed network, and in the second an undirected network (check your network coercion statement - BTW, you can just use matrices). Here is a demonstration that the first two are in fact equivalent:

> A <- matrix(c(0,1,0,0,0,0,0,0,0,0,
+           0,0,1,0,0,0,0,0,0,0,
+           0,0,0,1,0,0,0,0,0,0,
+           0,0,0,0,1,0,0,0,0,0,
+           0,0,0,0,0,1,0,0,0,1,
+           0,0,0,0,0,0,1,0,0,0,
+           0,0,0,0,0,0,0,1,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,1,0),ncol=10)
> A
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    0    0    0    0    0     0
 [2,]    1    0    0    0    0    0    0    0    0     0
 [3,]    0    1    0    0    0    0    0    0    0     0
 [4,]    0    0    1    0    0    0    0    0    0     0
 [5,]    0    0    0    1    0    0    0    0    0     0
 [6,]    0    0    0    0    1    0    0    0    0     0
 [7,]    0    0    0    0    0    1    0    0    0     0
 [8,]    0    0    0    0    0    0    1    0    0     0
 [9,]    0    0    0    0    0    0    0    0    0     1
[10,]    0    0    0    0    1    0    0    0    0     0
> B<-matrix(c(0,0,0,0,1,0,0,0,0,0,
+           1,0,1,0,0,0,0,0,0,0, 
+           0,0,0,0,0,0,1,0,0,0,
+           0,0,1,0,0,0,0,0,0,0,
+           0,0,0,0,0,1,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,1,0,
+           0,0,0,1,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,1,
+           0,0,0,0,0,0,0,0,0,0),ncol=10)
> B
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    0    0    0    0     0
 [2,]    0    0    0    0    0    0    0    0    0     0
 [3,]    0    1    0    1    0    0    0    0    0     0
 [4,]    0    0    0    0    0    0    0    1    0     0
 [5,]    1    0    0    0    0    0    0    0    0     0
 [6,]    0    0    0    0    1    0    0    0    0     0
 [7,]    0    0    1    0    0    0    0    0    0     0
 [8,]    0    0    0    0    0    0    0    0    0     0
 [9,]    0    0    0    0    0    0    1    0    0     0
[10,]    0    0    0    0    0    0    0    0    1     0
> efficiency(A)
[1] 1
> efficiency(B)
[1] 1

If you symmetrize - which is what you are doing in your example to one of the networks by setting "directed=FALSE" - then you get an efficiency of 0.889. Note that this matches what we should get:

1 - (18 - 9)/(choose(10,2)*2-9) = 0.889

(Remember that Krackhardt efficiency treats all networks as digraphs, so mutual edges count as two edges. Also, I note that you seem to have miscopied the formula from the man page, which may be part of your confusion.)
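The arithmetic above can be reproduced in base R without sna. This is only a sketch: the matrix below is a simple 10-node directed path that I am using as a stand-in (it is not your matrix), which works here because the formula only depends on the edge count and the component size for a single connected tree.

```r
# Stand-in example: a directed path on 10 vertices (9 edges, one weak component).
n <- 10
A <- matrix(0, n, n)
A[cbind(2:n, 1:(n - 1))] <- 1       # edges 2->1, 3->2, ..., 10->9

# Symmetrizing makes every edge mutual, so the digraph edge count doubles.
S <- (A + t(A) > 0) * 1

edges <- sum(S)                      # 18 directed edges after symmetrizing
min_edges <- n - 1                   # 9 edges are needed to keep 10 vertices connected
max_edges <- choose(n, 2) * 2        # 90 possible directed edges

eff <- 1 - (edges - min_edges) / (max_edges - min_edges)
round(eff, 3)                        # 0.889
```

Before symmetrizing, `sum(A)` is 9, so the numerator is zero and the efficiency is 1.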

Your third matrix again has efficiency 1, since it has no excess edges:

> C<-matrix(c(0,0,0,0,1,0,0,0,0,0,
+           1,0,0,0,0,0,0,0,0,0, 
+           0,0,0,0,0,0,1,0,0,0,
+           0,0,1,0,0,0,0,0,0,0,
+           0,0,0,0,0,1,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,1,0,
+           0,0,0,1,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,1,
+           0,0,0,0,0,0,0,0,0,0),ncol=10)
> C
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    0    0    0    0     0
 [2,]    0    0    0    0    0    0    0    0    0     0
 [3,]    0    0    0    1    0    0    0    0    0     0
 [4,]    0    0    0    0    0    0    0    1    0     0
 [5,]    1    0    0    0    0    0    0    0    0     0
 [6,]    0    0    0    0    1    0    0    0    0     0
 [7,]    0    0    1    0    0    0    0    0    0     0
 [8,]    0    0    0    0    0    0    0    0    0     0
 [9,]    0    0    0    0    0    0    1    0    0     0
[10,]    0    0    0    0    0    0    0    0    1     0
> efficiency(C)
[1] 1

Your problem stems from symmetrizing (coercing C into a network object using directed=FALSE). This adds extra edges, leading to

1- (16 - 3 - 5) / (choose(4,2)*2 - 3 + choose(6,2)*2 - 5) = 0.7647059

which is equivalent to what sna gives you:

> efficiency(symmetrize(C))
[1] 0.7647059
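If it helps to see where the numbers come from, here is a rough base-R sketch of the same calculation done per weak component. The function name `krackhardt_efficiency` is my own (hypothetical, not part of sna), and it assumes a square 0/1 adjacency matrix with at least one component of more than one vertex; it is meant to illustrate the counting, not to replace `efficiency()`.

```r
# Hypothetical helper, not an sna function: efficiency from a 0/1 adjacency matrix.
krackhardt_efficiency <- function(adj) {
  adj <- (adj != 0) * 1
  diag(adj) <- 0
  n <- nrow(adj)
  und <- (adj + t(adj)) > 0            # ignore direction to find weak components
  comp <- integer(n)                   # component label per vertex, 0 = unvisited
  k <- 0
  for (v in seq_len(n)) {
    if (comp[v] != 0) next
    k <- k + 1
    comp[v] <- k
    queue <- v
    while (length(queue) > 0) {        # breadth-first search from v
      u <- queue[1]
      queue <- queue[-1]
      nb <- which(und[u, ] & comp == 0)
      comp[nb] <- k
      queue <- c(queue, nb)
    }
  }
  sizes <- as.vector(table(comp))      # N_i for each weak component
  excess <- sum(adj) - sum(sizes - 1)  # directed edges beyond the minimum
  max_excess <- sum(sizes * (sizes - 1) - (sizes - 1))
  1 - excess / max_excess              # NaN if every component is a single vertex
}

# Your third matrix, rebuilt from its edge list (row -> column).
C <- matrix(0, 10, 10)
C[cbind(c(1, 3, 4, 5, 6, 7, 9, 10), c(2, 4, 8, 1, 5, 3, 7, 9))] <- 1

eff_directed <- krackhardt_efficiency(C)                      # 8 edges, no excess
eff_symmetric <- krackhardt_efficiency((C + t(C) > 0) * 1)    # 16 edges, 8 excess
eff_directed     # 1
eff_symmetric    # 0.7647059
```

The two components have 4 and 6 vertices, so the minimum edge counts are 3 and 5; the directed version has exactly 8 edges, while the symmetrized version has 16.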

Hope that clears things up!


 