Serializing a directed, weighted graph
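I have a directed, weighted graph. Each node of the graph is represented as a 2-tuple: its first element is the name of the node, and its second element is a tuple containing the targets of all edges originating from this node, ordered by weight. In effect, the weight of each edge is its index within that tuple.
For example: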
a = ('A', () )
Here, a is a node named A with no outgoing edges.
b = ('B', () )
a = ('A', (b,) )
Here, a is a node named A with one outgoing edge, of weight 0, pointing to the node named B.
b = ('B', () )
c = ('C', () )
a = ('A', (b, c) )
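Here, a is a node named A with two outgoing edges pointing to the nodes named B and C, the first with weight 0 and the second with weight 1.
Obviously, ('A', (b, c) ) is not equal to ('A', (c, b) ).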
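To make the representation concrete, here is a minimal sketch that reads the weights back out of a node. The helper name outgoing_edges is hypothetical and not part of the original question; it only assumes the 2-tuple format described above.

```python
def outgoing_edges(node):
    """Yield (weight, target_name) pairs for a node in the 2-tuple format.

    The weight of an edge is simply the index of its target in the tuple.
    """
    _name, targets = node
    for weight, (target_name, _) in enumerate(targets):
        yield weight, target_name


b = ('B', ())
c = ('C', ())
a = ('A', (b, c))

print(list(outgoing_edges(a)))   # [(0, 'B'), (1, 'C')]
print(a == ('A', (c, b)))        # False: swapping the targets swaps the weights
```

Because tuples compare element by element, swapping two targets produces a different node, which is why the order carries the weight information.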
Now I need to serialize this graph according to these rules:
Basically, low to high (weight) first, depth second.
Here is an example input and output:
f = ('F', () )
e = ('E', () )
d = ('D', (e,) )
c = ('C', (f, d, e) )
b = ('B', (d,) )
a = ('A', (b, c) )
The result is:
['A', 'B', 'C', 'D', 'F', 'E']
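One possible reading of "weight first, depth second" is a level-by-level walk: within each level, targets are visited in ascending weight, and a name that has already been reached is skipped. The sketch below is only an illustration of that reading (the function name layers is made up here), and it assumes node names are unique.

```python
def layers(root):
    """Group node names by the step at which they are first reached."""
    seen = {root[0]}
    current = [root]
    result = []
    while current:
        result.append([name for name, _ in current])
        following = []
        for _, targets in current:
            for target in targets:            # tuple order = ascending weight
                if target[0] not in seen:     # skip names already reached
                    seen.add(target[0])
                    following.append(target)
        current = following
    return result


f = ('F', ())
e = ('E', ())
d = ('D', (e,))
c = ('C', (f, d, e))
b = ('B', (d,))
a = ('A', (b, c))

print(layers(a))   # [['A'], ['B', 'C'], ['D', 'F', 'E']]
```

Flattening the levels gives the expected output. D comes before F because it is first reached through B, which is processed before C.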
My first, naive approach was:
def serialize (node):
    acc = []

    def serializeRec (node, level):
        tbd = [] #acc items to be deleted
        tbi = False #insertion index
        for idx, item in enumerate (acc):
            if item [1] > level and tbi == False:
                tbi = idx
            if item [0] == node [0]:
                if item [1] > level: tbd.append (item)
                else: break
        else:
            if tbi == False: acc.append ( (node [0], level) )
            else: acc.insert (tbi, (node [0], level) )
        for item in tbd:
            acc.remove (item)
        for vertex in node [1]:
            serializeRec (vertex, level + 1)

    serializeRec (node, 0)
    #remove levels
    return [node for node, level in acc]
This is obviously a pretty bad idea, because in every recursive call I iterate over various lists. That is why I switched to dictionaries:
from collections import defaultdict

def serializeDict (node):
    levels = defaultdict (list) #nodes on each level
    nodes = {} #on which level is which node

    def serializeRec (node, level):
        try:
            curLevel = nodes [node [0] ]
            if curLevel > level:
                #seen before on a deeper level: move it up
                nodes [node [0] ] = level
                levels [curLevel].remove (node [0] )
                levels [level].append (node [0] )
        except KeyError:
            #first time this node is seen
            nodes [node [0] ] = level
            levels [level].append (node [0] )
        for vertex in node [1]:
            serializeRec (vertex, level + 1)

    serializeRec (node, 0)
    #flatten dict items
    return [node for level in (v for _, v in sorted (levels.items (), key = lambda x: x [0] ) ) for node in level]
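Except for very small graphs, this runs much faster.
My question now is:
How can I optimize this serialization with the goal of minimizing runtime?
Memory usage doesn't matter (yes, baby), KLOC doesn't matter, only runtime counts. Everything except the format of the input data can be changed. But if it saves time in the end, I'm happy to reorganize the data inside the serialization function.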
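As an illustration of what "reorganizing the data inside the serialization function" could look like, the sketch below flattens the nested tuples into a dictionary keyed by node name. The helper build_adjacency is hypothetical, it assumes node names are unique, and whether the extra pass pays off in runtime would have to be measured.

```python
def build_adjacency(root):
    """Map each reachable node name to the ordered list of its target names."""
    adjacency = {}
    stack = [root]
    while stack:
        name, targets = stack.pop()
        if name in adjacency:       # already converted
            continue
        # Keep tuple order so the list index is still the edge weight.
        adjacency[name] = [target_name for target_name, _ in targets]
        stack.extend(targets)
    return adjacency


f = ('F', ())
e = ('E', ())
d = ('D', (e,))
c = ('C', (f, d, e))
b = ('B', (d,))
a = ('A', (b, c))

print(build_adjacency(a)['C'])   # ['F', 'D', 'E']
```

The input format stays untouched; only the working copy inside the function changes.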
Thank you very much for reading this TL;DR wall of text.
A sample graph to play around with:
z = ('Z', () ); y = ('Y', (z,) ); x = ('X', (z, y) ); w = ('W', (x, y, z) ); v = ('V', (w, x) ); u = ('U', (w, v) ); t = ('T', (u, w) ); s = ('S', (z, v, u) ); r = ('R', (t, u, z) ); q = ('Q', (r, z) ); p = ('P', (w, u) ); o = ('O', (v, r, q) ); n = ('N', (r, z) ); m = ('M', (t,) ); l = ('L', (r,) ); k = ('K', (x, v) ); j = ('J', (u,) ); i = ('I', (n, k) ); h = ('H', (k, x) ); g = ('G', (l,) ); f = ('F', (t, m) ); e = ('E', (u,) ); d = ('D', (t, e, v) ); c = ('C', (m,) ); b = ('B', (n,) ); a = ('A', (g, m, v) )
This works without recursion and uses a double-ended queue for efficiency:
from collections import deque

def serialize_plain(n):
    name, children = n
    output = [name]

    candidates = deque(children)
    while candidates:
        cname, cchildren = candidates.popleft()
        if cname not in output:
            output.append(cname)
            candidates.extend(cchildren)

    return output
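Depending on the size of your graph, it may make sense to keep a set of the nodes you have already processed, to avoid expensive list lookups: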
from collections import deque

def serialize_with_set(n):
    name, children = n
    output = [name]
    done = {name}

    candidates = deque(children)
    while candidates:
        cname, cchildren = candidates.popleft()
        if cname not in done:
            output.append(cname)
            done.add(cname)
            candidates.extend(cchildren)

    return output
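To get a feel for why list lookups become expensive, here is a small, purely illustrative timing sketch using the standard timeit module. The collection size and the probe value are arbitrary assumptions, and the absolute numbers will vary from machine to machine.

```python
import timeit

# 10,000 made-up node names, stored once as a list and once as a set.
names = [f"node{i}" for i in range(10_000)]
as_list = names
as_set = set(names)
probe = "node9999"   # last element: worst case for a linear list scan

# A membership test on a list scans the elements; on a set it is a hash lookup.
list_time = timeit.timeit(lambda: probe in as_list, number=1_000)
set_time = timeit.timeit(lambda: probe in as_set, number=1_000)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

For a graph with only a handful of nodes the difference is likely negligible, which matches the "depending on the size of your graph" caveat above.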