
How to extract certain path types in igraph?

TLDR: I'd like to extract the edge types of every path between two vertices in igraph. Is there a relatively sane way to do this?


The clinic I work for recently undertook a rather large (1400-person) tuberculosis contact investigation in a high school. I have class schedules for all of the students and teachers (!) and have put them into a network (using igraph in R), with each student and each room-period combination as a vertex (e.g., the class in Room 123 in Period 1 is a vertex with a directed edge to the class that's in Room 123 for Period 2). I also know which rooms share ventilation systems - a plausible but unlikely mechanism for infection. The graph is directed outward from the sole source case, so every path on the network has only two people in it - the source and a contact, separated by a variable number of room-period vertices. Conceptually, there are four kinds of paths:

  • personal-contact exposures (source -> contact only)
  • shared-class exposures (source -> room-period -> contact)
  • next-period exposures (source -> Room 123 Period 1 -> Room 123 Period 2 -> contact)
  • ventilation exposures (source -> Room 123 Period 1 -> Room 125 Period 1 -> contact)

Every edge has an attribute indicating whether it's a person-to-person exposure, same-room-different-period, or ventilation edge.

As an intermediate step toward modeling infection on this network, I'd like to just get a simple count of how many exposures of each type a student has had. For example, a student might have shared a class with the source, then later have been in a room the source had been in but a period later, and perhaps the next day been in a ventilation-adjacent room. That student's indicators would then be:

personal.contact: 0
shared.class:     1
next.period:      1
vent:             1

I'm not sure how best to get this kind of info, though - I see functions for getting shortest paths, which makes identifying personal contact links easy, but I think I need to evaluate all paths (which seems like a crazy thing to ask for on a typical social network, but isn't so mad when only the source and the room-periods have out-edges). If I could get to the point where each source-to-contact path were represented by an ordered vector of edge types, I think I could subset them to my criteria easily. I just don't know how to get there. If igraph isn't the right framework for this and I just need to write some big horrible loops over the students' schedules, so be it! But I'd appreciate some guidance before I dive down that hole.


Here's a sample graph of a contact with each of the three indirect paths:

# Strings ain't factors
options(stringsAsFactors = FALSE)  
library(igraph)

# Create a sample case
edgelist <- data.frame(out.id = c("source", "source", 
                                  "source", "Rm 123 Period 1", 
                                  "Rm 125 Period 2", "Rm 125 Period 3", 
                                  "Rm 127 Period 4", "Rm 129 Period 4"),
                       in.id = c("Rm 123 Period 1", "Rm 125 Period 2", 
                                 "Rm 127 Period 4", "contact", 
                                 "Rm 125 Period 3", "contact", 
                                 "Rm 129 Period 4", "contact"),
                       edge.type = c("Source in class", "Source in class",
                                     "Source in class", "Student in class",
                                     "Class-to-class", 
                                     "Student in class", "Vent link",
                                     "Student in class"
                                     )
)

# graph_from_data_frame() is the current name for the older graph.data.frame()
samp.graph <- graph_from_data_frame(edgelist, directed = TRUE)

# Label the vertices with meaningful names
V(samp.graph)$label <- V(samp.graph)$name

# layout_with_fr is the current name for layout.fruchterman.reingold
plot(samp.graph, layout = layout_with_fr)

I'm not entirely sure that I understand your graph model, but if the question is:

I have two vertices and I wish to extract every path between them,
then extract the edge attributes of those edges.

then perhaps this might work.

Go with a graph search. igraph contains a breadth-first search, but it's easy enough to roll your own, and this will give you more flexibility as to what information you want to get; the recursive version below is depth-first. I assume you have no cycles in your graph - otherwise there are infinitely many paths unless you restrict yourself to simple ones (paths that never revisit a vertex). I don't know much Python (though I do use igraph in R), so here's some pseudocode.

paths <- empty

# Call as allSimplePaths(source, target, [source]); thisPath always ends at u.
allSimplePaths(u, v, thisPath)
  if (u == v)
    paths <- paths + thisPath    # reached the target: record a copy of the path
    return
  for (n in outNeighbors(u))
    if (n in thisPath)           # skip vertices already on this path
      next
    thisPath <- thisPath + n
    allSimplePaths(n, v, thisPath)
    thisPath <- thisPath - thisPath.end

Basically it says "from each vertex, try all possible paths of expansion to get to the end." It's a simple matter to add a second accumulator, thisPathEdges, and push and pop edges on it alongside the vertices as you pass it through the function. Of course this would run better were it not recursive. Be careful: the recursion goes as deep as the longest path, so this algorithm might blow your stack with enough nodes.
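For what it's worth, here is a minimal R sketch of the same idea, run on the sample edge list from the question. It walks the edge list directly rather than calling igraph, and it carries the edge types along instead of the vertices. The function names edge_type_paths and classify_path are made up for illustration, and the rule that maps a path to one of the four indicators is an assumption about how the edge types line up with the exposure categories, not something stated in the question.

```r
# Sample edge list from the question (out.id -> in.id, with an edge type).
edges <- data.frame(
  out.id = c("source", "source", "source", "Rm 123 Period 1",
             "Rm 125 Period 2", "Rm 125 Period 3",
             "Rm 127 Period 4", "Rm 129 Period 4"),
  in.id = c("Rm 123 Period 1", "Rm 125 Period 2", "Rm 127 Period 4",
            "contact", "Rm 125 Period 3", "contact",
            "Rm 129 Period 4", "contact"),
  edge.type = c("Source in class", "Source in class", "Source in class",
                "Student in class", "Class-to-class", "Student in class",
                "Vent link", "Student in class"),
  stringsAsFactors = FALSE
)

# Depth-first enumeration of the simple paths from `from` to `to`.
# Returns a list with one character vector of edge types per path.
edge_type_paths <- function(edges, from, to,
                            visited = from, types = character(0)) {
  if (from == to) return(list(types))
  # Out-edges of `from` that do not revisit a vertex already on the path.
  out <- edges[edges$out.id == from & !(edges$in.id %in% visited), ]
  found <- list()
  for (i in seq_len(nrow(out))) {
    found <- c(found,
               edge_type_paths(edges, out$in.id[i], to,
                               c(visited, out$in.id[i]),
                               c(types, out$edge.type[i])))
  }
  found
}

# Hypothetical mapping from a path's edge types to the four indicators.
classify_path <- function(types) {
  if (length(types) == 1) return("personal.contact")  # direct edge only
  if ("Class-to-class" %in% types) return("next.period")
  if ("Vent link" %in% types) return("vent")
  "shared.class"
}

paths  <- edge_type_paths(edges, "source", "contact")
labels <- vapply(paths, classify_path, character(1))
counts <- table(factor(labels, levels = c("personal.contact", "shared.class",
                                          "next.period", "vent")))
print(counts)
```

On the sample data this finds three paths, one each for shared.class, next.period and vent, and none for personal.contact. Recent igraph releases also ship all_simple_paths(), and E(graph, path = p)$edge.type reads the edge attribute along a vertex path p, so the hand-rolled recursion may not be needed if your igraph version has them.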

You still might want to go with @PaulG's model, and just have multiple edges between student nodes. You could do cool things like run a breadth-first search to see how the disease spread, find a minimum spanning tree to get a time estimate, or find a min-cut to quarantine an ongoing infection.
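As a rough illustration of the breadth-first idea, the sketch below counts how many edges separate each vertex from the source. It uses the room-period sample graph from the question (not @PaulG's model, which isn't reproduced here), and it assumes an igraph version that provides graph_from_data_frame() and distances().

```r
library(igraph)

# Same sample edges as in the question; edge types are not needed here.
edges <- data.frame(
  out.id = c("source", "source", "source", "Rm 123 Period 1",
             "Rm 125 Period 2", "Rm 125 Period 3",
             "Rm 127 Period 4", "Rm 129 Period 4"),
  in.id = c("Rm 123 Period 1", "Rm 125 Period 2", "Rm 127 Period 4",
            "contact", "Rm 125 Period 3", "contact",
            "Rm 129 Period 4", "contact"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Unweighted shortest-path lengths from the source, following edge direction.
# distances() returns a 1 x n matrix here; take its only row as a named vector.
hops <- distances(g, v = "source", mode = "out")[1, ]
print(sort(hops))
```

A vertex that cannot be reached from the source would show up as Inf, which is a quick way to spot students with no recorded exposure path.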
