Find shortest triangle in a graph

I have a set of points and need to select an optimal subset of 3 of them, where the criterion is a linear sum of some properties of the points, and some properties of pairs of the points.

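In Python, this is quite easy using itertools.combinations:

from itertools import combinations

# Materialize the combinations so they can still be indexed after the loop.
all_points = list(combinations(points, 3))
costs = []
for i, (p1, p2, p3) in enumerate(all_points):
    # Cost of a triple: three node weights plus three pair weights.
    costs.append((p1.weight + p2.weight + p3.weight
                  + pair_weight(p1, p2) + pair_weight(p1, p3) + pair_weight(p2, p3),
                  i))
costs.sort()
best = all_points[costs[0][1]]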

The problem is that this is a brute-force solution: it enumerates every combination of 3 points, which is O(n^3) in the number of points and quickly leads to a very large number of evaluations. I have been researching whether there is a more efficient way to do this, perhaps by taking advantage of the linearity of the cost function.

I have tried turning this into a networkx graph with node and edge weights. However, I have not yet found an algorithm in that toolkit that can calculate the "shortest triangle", particularly one that considers both edge and node weights. (Shortest-path algorithms, for example, tend to consider only edge weights.)

There are functions to enumerate all cliques, and then I can select 3-cliques, and calculate the cost, but this is also brute force and therefore not better than doing it with combinations as above.

Are there any other algorithms I can look at?

By the way, if I do not have the edge weights, it is easy to just sort the nodes by their node-weight and choose the first three. So it is really the paired costs that add complexity to this problem. I am wondering if somehow I can just list all pairs and find the top-k of those that form triangles, or something better? At least if I could efficiently enumerate top candidates and stop the enumeration on some heuristic, it might be better than the brute force approach.

From now on, I will use n as the number of nodes and m as the number of edges. If your graph is fully connected, then m is just n choose 2. I'll also disregard node weights because, as the comments on your post have noted, the node weights can be absorbed into the edges they're connected to.
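One way to do that absorption, sketched under the assumption that only triangles are being scored: every node of a triangle lies on exactly two of its edges, so adding half of each endpoint's weight to every edge leaves each triangle's total cost unchanged. The names fold_node_weights, node_weight and edge_weight are made up for this illustration.

```python
def fold_node_weights(node_weight, edge_weight):
    """Return new edge weights with half of each endpoint's node weight added.

    node_weight: dict node -> weight (hypothetical input format)
    edge_weight: dict (u, v) -> weight, one entry per undirected edge
    """
    return {
        (u, v): w + (node_weight[u] + node_weight[v]) / 2
        for (u, v), w in edge_weight.items()
    }


node_weight = {"a": 1.0, "b": 2.0, "c": 4.0}
edge_weight = {("a", "b"): 10.0, ("a", "c"): 20.0, ("b", "c"): 30.0}
folded = fold_node_weights(node_weight, edge_weight)

# Cost of triangle a-b-c, computed both ways.
original_cost = sum(node_weight.values()) + sum(edge_weight.values())
folded_cost = sum(folded.values())
```

Here both original_cost and folded_cost come out the same, so the rest of the search only has to look at edge weights.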

Your algorithm is O(n^3); it's hopefully not too hard to see why: you iterate over every possible triplet of nodes. However, it is possible to iterate over every triangle in a graph in O(m sqrt(m)):

for every node u:
    for every node v adjacent to u:
        if degree(u) < degree(v): continue;
        for every node w adjacent to v:
            if degree(v) < degree(w): continue;
            if u is not connected to w: continue;
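            // <u,v,w> is a triangle! (When degrees tie, break the tie by node id;
            // otherwise the same triangle can be reported more than once.)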

The proof for this algorithm's runtime of O(m sqrt(m)) is nontrivial, so I'll direct you here: https://cs.stanford.edu/~rishig/courses/ref/l1.pdf
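For reference, here is a rough Python sketch of the loop above, extended to keep track of the cheapest triangle. It assumes node weights have already been folded into the edges, that the graph is stored as a dict of dicts (node -> neighbour -> weight, symmetric, no self-loops), and that node ids are mutually comparable so they can break degree ties. The function names and the input format are my own choices, not something from networkx.

```python
def build_adjacency(edges):
    """Build a symmetric dict-of-dicts from (u, v, weight) triples."""
    adj = {}
    for u, v, weight in edges:
        adj.setdefault(u, {})[v] = weight
        adj.setdefault(v, {})[u] = weight
    return adj


def min_weight_triangle(adj):
    """Return (cost, (u, v, w)) for the cheapest triangle, or None if there is none."""
    # Rank by (degree, node id) so that ties in degree are broken consistently
    # and each triangle is visited exactly once, with rank[u] > rank[v] > rank[w].
    rank = {u: (len(nbrs), u) for u, nbrs in adj.items()}
    best = None
    for u in adj:
        for v, w_uv in adj[u].items():
            if rank[u] < rank[v]:
                continue
            for w, w_vw in adj[v].items():
                if rank[v] < rank[w]:
                    continue  # this also skips w == u
                w_uw = adj[u].get(w)
                if w_uw is None:
                    continue  # u is not connected to w
                cost = w_uv + w_vw + w_uw
                if best is None or cost < best[0]:
                    best = (cost, (u, v, w))
    return best


edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3), ("c", "d", 1), ("b", "d", 1)]
result = min_weight_triangle(build_adjacency(edges))
```

In this small graph the triangles are a-b-c (cost 6) and b-c-d (cost 4), so result holds cost 4 for the nodes b, c and d.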

If your graph is fully connected, then you've gotta stick with the O(n^3), I think: m is about n^2/2 there, so O(m sqrt(m)) is itself O(n^3). There might be some early-pruning ideas you can do but they won't lead to a significant speedup, probably 2x at very best.
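As an illustration of what such early pruning could look like (one possible idea, not a recommendation), the sketch below skips an edge (i, j) whenever an optimistic bound already rules it out: no triangle through that edge can cost less than W[i][j] plus the cheapest edge at i plus the cheapest edge at j. It assumes a symmetric weight matrix W given as a list of lists, with node weights already folded in and at least 3 nodes. The worst case is still O(n^3), and how much it saves depends entirely on the data.

```python
def min_triangle_complete(W):
    """W: symmetric n x n list of lists of edge weights (diagonal is ignored)."""
    n = len(W)
    # Cheapest edge at each node, used as an optimistic lower bound.
    cheapest = [min(W[i][k] for k in range(n) if k != i) for i in range(n)]
    best_cost, best = float("inf"), None
    for i in range(n):
        for j in range(i + 1, n):
            # Every triangle through edge (i, j) costs at least this much,
            # so skip the inner loop if it cannot beat the current best.
            if W[i][j] + cheapest[i] + cheapest[j] >= best_cost:
                continue
            for k in range(j + 1, n):
                cost = W[i][j] + W[i][k] + W[j][k]
                if cost < best_cost:
                    best_cost, best = cost, (i, j, k)
    return best_cost, best


W = [
    [0, 1, 3, 10],
    [1, 0, 2, 1],
    [3, 2, 0, 1],
    [10, 1, 1, 0],
]
best_cost, best = min_triangle_complete(W)
```

The bound never discards a triangle that would strictly improve on the current best, so the result matches the plain triple loop; it only avoids some of the inner iterations.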
