
Given edges in an undirected graph, what is an algorithm for limiting the maximum degree of the graph while maximizing the degree of the graph?

This is for my research in protein folding (so I guess technically a school project).

Summary: I have the edges of a weighted undirected graph. Each vertex of the graph has anywhere from 1 to 20-ish edges. I would like to trim this graph down such that no vertex has more than 6 edges. I would also like the graph to retain as much connectivity as possible (maximize the degree).

Background: I have a Delaunay tessellation of the atoms (essentially a point cloud) in a protein, computed with the scipy library. I use this to create a list of all pairs of residues that are in contact with each other (I store the distance between them). This list contains every pair (twice), and the distance between the pairs. (A residue contains many atoms, so I use their average position as the position of the residue.)

pairs
[(ALA 1, GLU 2, 2.7432), (ALA 1, GLU 2, 2.7432), (ALA 4, ASP 27, 4.8938), (ALA 4, ASP 27, 4.8938) ... ]

What I have tried (which works but isn't exactly what I want) is to only store the six closest contacts. (I sort the residue names so I can use collections later.)

for contact in residue.contacts[:6]:
    pairs.append( tuple( sorted([residue.name, contact.name], key=lambda r: r.name) + [residue.dist[contact]] ) )

And then remove any contacts that are not reciprocated. (I guess technically add contacts that are.)

import collections

# a pair stored by both of its residues appears twice; keep only those
new_pairs = []
counter = collections.Counter(pairs)
for key, val in counter.items():
    if val == 2:
        new_pairs.append(key)

This works, but I lose some information that I would like to keep. I phrased the question as a graph theory problem because I feel like this problem has already been solved in that field.

I was thinking that a greedy algorithm might work:

while run_greedy:
    # find the residue with the maximum number of neighbors
    # find that residue's neighbor with the maximum number of neighbors, but only if the pair exists in pairs
    # remove that pair from pairs

    # if maximum_degree <= 6: run_greedy = False
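
A minimal Python sketch of that loop, under some assumptions: residue identifiers are plain strings, `pairs` holds `(residue, residue, distance)` tuples, and ties are broken by name so the result is repeatable. The function name `trim_by_max_degree` is hypothetical, and nothing here claims the result is optimal.

```python
from collections import defaultdict

def trim_by_max_degree(pairs, max_degree=6):
    """Repeatedly drop the edge joining the busiest residue to its busiest neighbor."""
    # Sorting the two names collapses the duplicated (a, b) / (b, a) entries.
    dist = {tuple(sorted((a, b))): d for a, b, d in pairs}
    adj = defaultdict(set)
    for a, b in dist:
        adj[a].add(b)
        adj[b].add(a)

    while adj:
        # residue with the maximum number of neighbors (ties broken by name)
        busiest = max(adj, key=lambda r: (len(adj[r]), r))
        if len(adj[busiest]) <= max_degree:
            break
        # its neighbor with the maximum number of neighbors
        partner = max(adj[busiest], key=lambda r: (len(adj[r]), r))
        adj[busiest].discard(partner)
        adj[partner].discard(busiest)
        del dist[tuple(sorted((busiest, partner)))]

    return sorted((a, b, d) for (a, b), d in dist.items())

# Hypothetical data: one hub residue in contact with eight others.
star = [("hub", f"r{i}", float(i)) for i in range(8)]
star += [(b, a, d) for a, b, d in star]  # every pair stored twice
print(len(trim_by_max_degree(star)))
```

Note that this sketch ignores the stored distances when choosing which edge to drop; they are only carried through to the output.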

Does the greedy algorithm work? Are there known algorithms that do this well? Is there a library that can do this (I am more than willing to change the format of the data to fit the library)?

I hope this is enough information. Thanks in advance for the help.

EDIT: this is a variant of the knapsack problem: you add edges one by one, and want to maximize the number of edges while the graph built doesn't exceed a given degree.

The following solution uses dynamic programming.

Let m[i, d] be the maximum subset of the edges e_0, ..., e_{i-1} that creates a subgraph of maximum degree <= d.

  • m[i, 0] = {}
  • m[0, d] = {}
  • m[i, d] = m[i-1, d] + {e_i} if the degree of the resulting graph is <= d
  • otherwise, m[i, d] = m[i-1, d-1] + {e_i} if it has more edges than m[i-1][d], else m[i-1][d]

Hence the algorithm (not tested):

for i in 0..N:
    m[i][0] = {}
for d in 1..K:
    m[0][d] = {}

for d in 1..K:
    for i in 1..N:
        G1 = m[i-1][d] + {e_i}
        if D(G1) <= d: # e_i can be added while keeping the degree <= d
            m[i][d] = G1
        else:
            m[i][d] = max(m[i-1][d-1] + {e_i}, m[i-1][d]) # key=cardinal

Solution is: m[N-1][K-1]. Time complexity is O(KN^2) (nested loops: KN iterations, each computing the maximum degree of the graph in N or less).

Previous answer

TL;DR: I don't know how to find an optimal solution, but a greedy algorithm might give you acceptable results.

The problem

Let me rephrase the problem, based on your question and your code: you want to remove a minimum number of edges from your graph in order to reduce the maximum degree of the graph to 6. That is, get the maximal subgraph G' of G with d(u) <= 6 for all u in G', where d(u) is the degree of u.

The closest idea I found is the K-core of a graph, but that's not exactly the same problem.

Your method

Your method is clearly not optimal, since you keep at most 6 edges of every vertex and recreate the graph with those edges. Take the graph ABC:

A -> 1. B, 2. C
B -> 1. C, 2. A
C -> 1. A, 2. B

If you try to reduce the maximum degree of this graph to 1 using your method, the first pass will remove AC (C is the 2nd neighbor of A), BA (A is the 2nd neighbor of B) and CB (B is the 2nd neighbor of C):

A -> 1. B
B -> 1. C
C -> 1. A

The second pass, to ensure that the graph is undirected, will remove all the remaining edges (and vertices), since none of them is reciprocated.

An optimal reduction would be:

A -> 1. B
B -> 1. A

Or any other pair of vertices in A, B, C.
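
The ABC example can be checked with a short Python sketch. It assumes each vertex lists its neighbors in order of preference, and `reciprocal_top_k` is a hypothetical name for the "keep the first k, then keep only reciprocated pairs" method described in the question.

```python
from collections import Counter

def reciprocal_top_k(ranked_neighbors, k):
    """Keep each vertex's first k neighbors, then only the pairs chosen by both ends."""
    chosen = Counter()
    for u, neighbors in ranked_neighbors.items():
        for v in neighbors[:k]:
            chosen[tuple(sorted((u, v)))] += 1
    # A pair survives only if both of its endpoints selected it.
    return sorted(pair for pair, count in chosen.items() if count == 2)

triangle = {"A": ["B", "C"], "B": ["C", "A"], "C": ["A", "B"]}
print(reciprocal_top_k(triangle, 1))  # [] : every edge is lost
print(reciprocal_top_k(triangle, 2))  # all three edges survive
```

With k = 1 no pair is selected by both endpoints, so the result is empty even though a single edge such as A-B would satisfy the degree limit.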

Some strategy

Let:

  • k = 6
  • D(u) = max(d(u)-k, 0) : the number of neighbors above k, or 0
  • w(uv) (resp. s(uv)) = the weak (resp. strong) endpoint of the edge: the one having the lowest (resp. highest) degree
  • m(uv) = min(D(u), D(v))
  • M(uv) = max(D(u), D(v))

Let S = sum(D(u) for u in G). The goal is to make S = 0 while removing a minimum number of edges. If you remove:

(1) a floating edge: m(uv) > 0, then S is decreased by 2 (both endpoints lose 1 degree)

(2) a sinking edge: m(uv) = 0 and M(uv) > 0, then S is decreased by 1 (the degree of the weak endpoint is already <= 6)

(3) a sunk edge: M(uv) = 0, then S is unchanged
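
The three cases can be illustrated with a small Python sketch. The helper names `excess_degrees` and `classify` are hypothetical, and the example uses k = 1 on a tiny made-up graph so the numbers stay readable.

```python
from collections import defaultdict

def excess_degrees(edges, k):
    """Return D(u) = max(d(u) - k, 0) for every vertex u."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {u: max(d - k, 0) for u, d in degree.items()}

def classify(edge, excess):
    """Label an edge as floating, sinking or sunk from m(uv) and M(uv)."""
    u, v = edge
    m, M = sorted((excess[u], excess[v]))
    if m > 0:
        return "floating"  # removing it lowers S by 2
    if M > 0:
        return "sinking"   # removing it lowers S by 1
    return "sunk"          # removing it leaves S unchanged

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]
excess = excess_degrees(edges, k=1)
S = sum(excess.values())
print(S, {edge: classify(edge, excess) for edge in edges})
```

Here only b and c exceed the limit, so S = 2, and removing the single floating edge b-c brings S to 0 in one step.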

Note that a floating edge may become a sinking edge if: 1. its weak endpoint has a degree of k+1; 2. you remove another edge connected to this endpoint. Similarly, a sinking edge can become a sunk edge.

You have to remove floating edges while avoiding the creation of sinking edges, because removing a floating edge is more efficient at reducing S. Let K be the number of floating edges removed, and L the number of sinking edges removed (we don't remove sunk edges) to make S = 0. We want 2*K + L >= S. Obviously, the idea is to make L as small as possible, because we want a small number of edges removed (K + L).

I doubt you'll find an optimal greedy algorithm, because everything depends on the order of removal, and the remote consequences of the current removal are hard to predict.

But you can use a general strategy to limit the creation of sinking edges:

  1. do not remove edges with m(uv) = 1 unless you have no choice.
  2. if you have to remove an edge with m(uv) = 1, choose the one whose weak endpoint has the fewest floating edges (they will become sinking edges).

An algorithm

Here's a greedy algorithm that implements this strategy:

while {u, v in G | m(u-v) > 0} is not empty: // remove floating edges first
    remove the edge u-v with:
        1. the maximum m(u-v)
        2. w(u-v) has the minimum of neighbors t with D(t) > 0
        3. s(u-v) has the minimum of neighbors t with D(t) > 0

remove all edges from {u, v in G | M(u-v) > 0} // clean up sinking edges
clean orphan vertices

Termination: the algorithm terminates because we remove an edge on each iteration, thus {u in G | D(u) > 0} will become empty at some point.

Note: you can use a heap and update m(uv) after each removal.
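
A rough Python sketch of this greedy, not tested beyond toy graphs and with no optimality guarantee. It makes several assumptions: vertices are comparable values such as strings, ties are broken by name, everything is recomputed on each pass instead of using a heap, and the clean-up phase removes sinking edges one at a time (dropping the highest-degree neighbor first) rather than all at once. The name `limit_degree` is hypothetical.

```python
from collections import defaultdict

def limit_degree(edges, k):
    """Greedily remove edges until no vertex has more than k neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def excess(u):  # D(u)
        return max(len(adj[u]) - k, 0)

    def crowded(u):  # number of neighbors t of u with D(t) > 0
        return sum(1 for t in adj[u] if excess(t) > 0)

    def current_edges():
        return [(u, v) for u in adj for v in adj[u] if u < v]

    def remove(u, v):
        adj[u].discard(v)
        adj[v].discard(u)

    def rank(edge):
        # weak endpoint = lowest degree, strong endpoint = highest degree
        weak, strong = sorted(edge, key=lambda x: len(adj[x]))
        m = min(excess(weak), excess(strong))
        return (-m, crowded(weak), crowded(strong), edge)

    # Phase 1: remove floating edges, largest m(u-v) first.
    while True:
        floating = [e for e in current_edges()
                    if min(excess(e[0]), excess(e[1])) > 0]
        if not floating:
            break
        remove(*min(floating, key=rank))

    # Phase 2 (assumption): drop sinking edges one by one until S = 0.
    while True:
        over = [u for u in adj if excess(u) > 0]
        if not over:
            break
        u = max(over, key=lambda x: (excess(x), x))
        v = max(adj[u], key=lambda t: (len(adj[t]), t))
        remove(u, v)

    return sorted(current_edges())

print(limit_degree([("A", "B"), ("A", "C"), ("B", "C")], 1))
```

On the ABC triangle with k = 1 this keeps exactly one edge, which matches the optimal reduction described earlier; on larger graphs the result depends on the tie-breaking order.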
