
Using R / igraph, is there a way to find a shortest path between nodes taking the count of unique node attributes into account?

I'm trying to figure out how to find the shortest path between two nodes on a graph, using both the weights of the edges and an arbitrary penalty based on the count of the unique attributes of the nodes used.

I'm trying to keep the problem as general as possible.

For example, consider the following graph (simplified to demonstrate):

[image: enter image description here]

In R, I might construct my igraph object like:

EDIT - fixed typo in the edge weights!

library(igraph)

nodes = data.frame(id=c("A","B","C","D","E","F","G","H","I"),
                   colour = c("blue","blue","blue","blue","blue",
                              "red","green","yellow", "red"))

edges = data.frame(from = c("A","B","C","D","E","F","G","H","I","I","I"),
                   to = c("B","C","D","E","F","G","H","A","A","E","F"),
                   #weight = c(5,6,7,5,1,5,3,2,8,8,4)) <<- TYPO - SORRY
                   weight = c(5,6,9,5,1,5,3,2,8,8,4))

g = graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

To get from A to E, I can use igraph functions to calculate the shortest path.

shortest_paths(g, from="A", to="E", output="vpath")[[1]] returns the path AHGFE, and distances(g, "A", "E") gives its total weight of 11.

However, what if I wanted to add a penalty based on the count of unique colours of the nodes on the path? I.e. every distinct colour the path passes through (including the colours of the start and end nodes) adds 10 to the weight.

AHGFE = edge weight of 11 and uses 4 colours, so total weight is 11+(4*10)=51

ABCDE = edge weight of 25 uses 1 colour, total weight is 25+(1*10)=35

AIE = edge weight of 16, uses 2 colours, total weight is 16+(2*10)=36

AIFE = edge weight of 13, uses 2 colours, total weight is 13+(2*10)=33

AIFE is now the "shortest route".
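The arithmetic above can be reproduced with a few lines of base R. This is only a sketch: `path_score` and `candidates` are hypothetical names introduced here for illustration, the data frames are copied from the question, and it assumes each pair of consecutive nodes is joined by exactly one undirected edge.

```r
# Data copied from the question (with the corrected weights).
nodes <- data.frame(id = c("A","B","C","D","E","F","G","H","I"),
                    colour = c("blue","blue","blue","blue","blue",
                               "red","green","yellow","red"),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c("A","B","C","D","E","F","G","H","I","I","I"),
                    to = c("B","C","D","E","F","G","H","A","A","E","F"),
                    weight = c(5,6,9,5,1,5,3,2,8,8,4),
                    stringsAsFactors = FALSE)

# Hypothetical helper: penalised cost of one path, given as a vector of node ids.
path_score <- function(path, penalty = 10) {
  a <- head(path, -1)   # start node of each step
  b <- tail(path, -1)   # end node of each step
  w <- mapply(function(u, v) {
    # The graph is undirected, so match the edge in either orientation.
    hit <- (edges$from == u & edges$to == v) | (edges$from == v & edges$to == u)
    stopifnot(sum(hit) == 1)
    edges$weight[hit]
  }, a, b)
  n_col <- length(unique(nodes$colour[match(path, nodes$id)]))
  sum(w) + penalty * n_col
}

candidates <- list(AHGFE = c("A","H","G","F","E"),
                   ABCDE = c("A","B","C","D","E"),
                   AIE   = c("A","I","E"),
                   AIFE  = c("A","I","F","E"))
scores <- sapply(candidates, path_score)
print(scores)
```

With the data above this gives 51, 35, 36 and 33 for the four candidate routes, matching the hand calculation.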

I think (naively) that the weight needs to be a function of the previously visited node history.

I'm currently experimenting with igraph's bfs and dfs to understand how they work, without much success, and I think I'm barking up the wrong tree.

Before I try and reinvent the wheel, is there an existing igraph 'out of the box' function I'm missing to solve this?

I'm using R 3.6.0 and igraph 1.2.4.1 but can understand Python also.

TIA

An interesting question. I would define a custom distance/weight function that extracts all simple paths connecting two vertices in your graph. For every path we then (1) calculate the sum of edge weights (this is in essence what distances does), (2) count the unique colours and multiply that count by 10, and (3) calculate a score as the sum of the edge weights and the scaled colour count. The "optimal" path is then the path with the lowest overall score.

Here goes:

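min_wghtd_dist <- function(g, from, to) {
    # Enumerate every simple path between the two vertices
    pth <- all_simple_paths(g, from, to)
    score <- sapply(pth, function(x) {
        # Turn the path v1, v2, ..., vn into the pairs v1,v2, v2,v3, ...
        # and look up the id of the edge joining each pair
        edge_ids <- get.edge.ids(g, head(rep(x, each = 2), -1)[-1])
        # (1) Sum of edge weights along the path
        edge_sum <- sum(E(g)[edge_ids]$weight)
        # (2) Penalty of 10 per unique colour on the path
        col_wght <- 10 * length(unique(V(g)[x]$colour))
        # (3) Overall score
        edge_sum + col_wght
    })
    # Return the path with the lowest score, together with that score
    list(path = pth[[which.min(score)]], score = min(score))
}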

min_wghtd_dist(g, "A", "E")
#$path
#+ 4/9 vertices, named, from 09c0a2a:
#[1] A I F E
#
#$score
#[1] 33

The path A->I->F->E is correctly identified as the optimal path according to your requirements.
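Enumerating every simple path can become slow on larger graphs, because the number of simple paths can grow exponentially. As a rough alternative sketch (not an igraph function and not part of the approach above), one can run Dijkstra's algorithm over states of the form (vertex, set of colours seen so far), which makes the cost depend on the visited colour history. The helper name `min_colour_path` and its arguments are hypothetical; it works on the plain `edges` and `nodes` data frames from the sample data and assumes an undirected graph with non-negative weights. The number of states grows with the number of colour subsets, so this only pays off when there are few distinct colours.

```r
nodes <- data.frame(id = c("A","B","C","D","E","F","G","H","I"),
                    colour = c("blue","blue","blue","blue","blue",
                               "red","green","yellow","red"),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c("A","B","C","D","E","F","G","H","I","I","I"),
                    to = c("B","C","D","E","F","G","H","A","A","E","F"),
                    weight = c(5,6,9,5,1,5,3,2,8,8,4),
                    stringsAsFactors = FALSE)

# Hypothetical helper: Dijkstra over (vertex, colour set) states.
min_colour_path <- function(edges, nodes, from, to, penalty = 10) {
  colour_of <- setNames(as.character(nodes$colour), as.character(nodes$id))
  # Undirected graph: store every edge in both directions.
  adj <- data.frame(u = c(as.character(edges$from), as.character(edges$to)),
                    v = c(as.character(edges$to), as.character(edges$from)),
                    w = c(edges$weight, edges$weight),
                    stringsAsFactors = FALSE)
  # A state is identified by its vertex plus the sorted set of colours seen.
  key_of <- function(v, cols) paste(v, paste(sort(cols), collapse = ","), sep = "|")

  # The start node's colour already counts once towards the penalty.
  frontier <- list(list(cost = penalty, v = from,
                        cols = unname(colour_of[from]), path = from))
  done <- character(0)

  while (length(frontier) > 0) {
    # Pop the cheapest open state (a linear scan stands in for a priority queue).
    i <- which.min(vapply(frontier, function(s) s$cost, numeric(1)))
    s <- frontier[[i]]
    frontier[[i]] <- NULL
    k <- key_of(s$v, s$cols)
    if (k %in% done) next
    done <- c(done, k)
    if (s$v == to) return(list(path = s$path, score = s$cost))

    nb <- adj[adj$u == s$v, ]
    for (j in seq_len(nrow(nb))) {
      v <- nb$v[j]
      # Pay the penalty only when this step introduces a colour not seen before.
      is_new <- !(colour_of[[v]] %in% s$cols)
      frontier[[length(frontier) + 1]] <- list(
        cost = s$cost + nb$w[j] + penalty * is_new,
        v = v,
        cols = union(s$cols, colour_of[[v]]),
        path = c(s$path, v))
    }
  }
  NULL  # no path between the two vertices
}

res <- min_colour_path(edges, nodes, "A", "E")
print(res)
```

On the sample data this should agree with the enumeration above, returning the path A, I, F, E with a score of 33.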


Sample data

library(igraph)

nodes = data.frame(id=c("A","B","C","D","E","F","G","H","I"),
                   colour = c("blue","blue","blue","blue","blue",
                              "red","green","yellow", "red"))

edges = data.frame(from = c("A","B","C","D","E","F","G","H","I","I","I"),
                   to = c("B","C","D","E","F","G","H","A","A","E","F"),
                   weight = c(5,6,9,5,1,5,3,2,8,8,4))

g = graph_from_data_frame(edges, directed = FALSE, vertices = nodes)


 