
Betweenness and shortest paths in R

I've been reviewing this interesting article:

http://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/

As an exercise, I've been picking through the various steps taken to convict Mr. Revere of treason. At one point, the author uses the betweenness function of the igraph library, which is described as:

"The vertex and edge betweenness are (roughly) defined by the number of geodesics (shortest paths) going through a vertex or an edge."

So, in the case of the article, how many shortest communication paths between pairs of people go through each of the 254 people being considered? I diverted a little from the article, though, and I'm wondering if I'm thinking naively.

A 254 x 254 matrix has 64516 elements. However, trivial elements (those on the diagonal-- a person talking to herself is obviously the shortest path from X to X) can be discounted, leaving (it seems) 254 * 254 - 254 = 64262 total nontrivial ordered pairings. But, these are not directional-- that is, the shortest path between a particular pair X and Y is the same, regardless of which of X or Y is the sender and which is the receiver.

So, we can reduce our number of pairings: (254 * 254 - 254) / 2 = 32131.

Since this also happens to be the number of combinations of 2 selected from 254, even better-- a fine coincidence! ;-)
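The count is easy to confirm in base R. This is just a minimal sketch of the arithmetic above; the variable names are made up for illustration:

```r
n <- 254                                # number of people in the graph
ordered_pairs   <- n * n - n            # off-diagonal entries of the n x n matrix
unordered_pairs <- ordered_pairs / 2    # direction does not matter

ordered_pairs
# [1] 64262
unordered_pairs
# [1] 32131
unordered_pairs == choose(n, 2)         # same as "254 choose 2"
# [1] TRUE
```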

Then, just for fun, I did:

((254 * 254 - 254) / 2) - sum(betweenness(person.g)) = 10061

What does this number mean? It almost seems to say that there are 10,061 pairings for whom no path exists, but I don't see how that can be. Do I misunderstand betweenness? Many thanks in advance.

If you check what happens on a simpler graph, you will notice that shortest paths of length 1 do not enter the computation.

betweenness( graph.lattice( 3 ) )
# [1] 0 1 0

Shortest paths of length 2 will be used once (for the point in the middle), but shortest paths of length 3 or more will be used several times: once for each point in the middle.

betweenness( graph.lattice( 5 ) )
# [1] 0 3 4 3 0

In this example, the shortest paths are

length 1: 1-2, 2-3, 3-4, 4-5  (not used)
length 2: 1-3, 2-4, 3-5       (each used once, for the betweenness of 2, 3 and 4)
length 3: 1-4, 2-5            (each used twice, for 2,3 and 3,4)
length 4: 1-5                 (used 3 times, for 2, 3 and 4)

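In other words, a shortest path of length k is counted k-1 times, once for each vertex strictly between its endpoints. Summing the betweenness over all vertices therefore gives the sum of (length - 1) over all pairs, not the number of pairs.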
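The 5-vertex example can be reproduced by hand in base R, without igraph. This is only a sketch for a path graph, where the shortest path between i < j is unique and runs through i+1, ..., j-1; the names `pairs`, `path_len` and `bet` are made up for illustration:

```r
n <- 5
pairs <- t(combn(n, 2))               # all unordered pairs (i, j) with i < j
path_len <- pairs[, 2] - pairs[, 1]   # shortest-path length of each pair

# Betweenness by hand: for each vertex v, count the pairs with i < v < j
bet <- sapply(seq_len(n), function(v) sum(pairs[, 1] < v & v < pairs[, 2]))
bet
# [1] 0 3 4 3 0

# Each path of length k contributes k - 1 interior vertices
sum(path_len - 1) == sum(bet)
# [1] TRUE
```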

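# Pairwise shortest-path lengths in the article's graph
p <- shortest.paths(person.g)
# Each unordered pair contributes (length - 1) interior vertices
sum( p[upper.tri(p)] - 1 )
# [1] 22070
sum( betweenness( person.g ) )
# [1] 22070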
