DFS algorithm in Python with generators
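I am working on a project where I need to write a number of rules for text processing. A few days in, with some rules already implemented, I realised I needed to determine the order in which the rules run. No problem, topological sorting can help with that. But then I realised I cannot expect the graph to always be complete. So I came up with this idea: given a rule with a set of dependencies (or a single dependency), I need to check the dependencies of its dependencies. Sounds familiar? Yes. This problem is very similar to depth-first search on a graph.
I am not a mathematician, nor did I study CS, so graph theory is a new field for me. Nevertheless, I implemented something that works (see below), although I suspect it is inefficient.
Here is my search-and-yield algorithm. If you run it on the example below, you will see that it visits some nodes more than once, hence the suspected inefficiency.
A word about the input. The rules I wrote are basically Python classes that have a class attribute depends. I was criticised for not using inspect.getmro, but that would make things terribly complicated, because the classes would need to inherit from each other (see the example here).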
def _yield_name_dep(rules_deps):
    # debugging counter; it must be initialised at module level
    # (recursion_counter = 0) before the first call
    global recursion_counter
    recursion_counter += 1
    # yield all rules by their name and dependencies
    for rule, dep in rules_deps.items():
        if not dep:
            yield rule, dep
            continue
        else:
            yield rule, dep
            for ii in dep:
                i = getattr(rules, ii)
                instance = i()
                if instance.depends:
                    new_dep = {str(instance): instance.depends}
                    for dep in _yield_name_dep(new_dep):
                        yield dep
                else:
                    yield str(instance), instance.depends
Okay, now that you have stared at the code, here is some input you can test it with:
# Note: ('B') and ('E') below are plain strings, not one-element tuples.
# Iterating over them works only because the rule names are single characters;
# a one-element tuple would be written ('B',).
demo_class_content = """
class A(object):
    depends = ('B')
    def __str__(self):
        return self.__class__.__name__

class B(object):
    depends = ('C','F')
    def __str__(self):
        return self.__class__.__name__

class C(object):
    depends = ('D', 'E')
    def __str__(self):
        return self.__class__.__name__

class D(object):
    depends = None
    def __str__(self):
        return self.__class__.__name__

class F(object):
    depends = ('E')
    def __str__(self):
        return self.__class__.__name__

class E(object):
    depends = None
    def __str__(self):
        return self.__class__.__name__
"""

with open('demo_classes.py', 'w') as clsdemo:
    clsdemo.write(demo_class_content)

import demo_classes as rules

rule_start = {'A': ('B')}

def _yield_name_dep(rules_deps):
    # yield all rules by their name and dependencies
    for rule, dep in rules_deps.items():
        if not dep:
            yield rule, dep
            continue
        else:
            yield rule, dep
            for ii in dep:
                i = getattr(rules, ii)
                instance = i()
                if instance.depends:
                    new_dep = {str(instance): instance.depends}
                    for dep in _yield_name_dep(new_dep):
                        yield dep
                else:
                    yield str(instance), instance.depends

if __name__ == '__main__':
    # this is yielding nodes visited multiple times,
    # list(_yield_name_dep(rule_start))
    # hence, my workaround was to use set() ...
    rule_dependencies = list(set(_yield_name_dep(rule_start)))
    print(rule_dependencies)
To save you the trouble of running the code, the output of the function above is:
>>> print(list(_yield_name_dep(rule_start)))
[('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None), ('E', None), ('F', 'E'), ('E', None)]
>>> print(list(set(_yield_name_dep(rule_start))))
[('B', ('C', 'F')), ('E', None), ('D', None), ('F', 'E'), ('C', ('D', 'E')), ('A', 'B')]
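The set() workaround removes the duplicate ('E', None), but it also discards the traversal order, as the second output shows. A minimal sketch of an order-preserving alternative follows; the helper name unique_in_order is hypothetical and not part of the original code, and it assumes every yielded pair is hashable (true here, since the dependencies are tuples, strings or None).

```python
def unique_in_order(pairs):
    """Yield each (rule, deps) pair the first time it appears, keeping order."""
    seen = set()
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            yield pair


# The raw traversal output from above, with ('E', None) appearing twice.
raw = [('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None),
       ('E', None), ('F', 'E'), ('E', None)]

deduped = list(unique_in_order(raw))
print(deduped)
```

This only filters the output. The traversal itself still walks the shared node more than once, so it does not address the efficiency concern.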
While I have come up with a better solution, the question above still stands. So feel free to criticise my solution:
visited = []

def _yield_name_dep_wvisited(rules_deps, visited):
    # yield all rules by their name and dependencies
    for rule, dep in rules_deps.items():
        if not dep and rule not in visited:
            yield rule, dep
            visited.append(rule)
            continue
        elif rule not in visited:
            yield rule, dep
            visited.append(rule)
            for ii in dep:
                # "rules" is the module imported in the demo above
                i = getattr(rules, ii)
                instance = i()
                if instance.depends:
                    new_dep = {str(instance): instance.depends}
                    for dep in _yield_name_dep_wvisited(new_dep, visited):
                        if dep not in visited:
                            yield dep
                elif str(instance) not in visited:
                    visited.append(str(instance))
                    yield str(instance), instance.depends
The output of the above is:
>>> list(_yield_name_dep_wvisited(rule_start, visited))
[('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None), ('E', None), ('F', 'E')]
So you can now see that node E is visited only once.
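One thing to watch with a module-level visited list is that it keeps its contents between calls, so a second traversal yields nothing, and membership tests on a list get slower as it grows. A rough sketch of one way around both, assuming the rules are available as a plain dict of name to dependency tuple (the names dfs_preorder and graph are hypothetical, chosen for illustration):

```python
def dfs_preorder(graph, start, visited=None):
    """Yield (node, deps) pairs depth-first, visiting each node only once."""
    if visited is None:
        visited = set()          # fresh set for every top-level call
    if start in visited:
        return
    visited.add(start)
    deps = graph.get(start) or ()   # treat a missing entry or None as "no deps"
    yield start, deps
    for dep in deps:
        for item in dfs_preorder(graph, dep, visited):
            yield item


# Same shape as the demo classes above, written as a dict (an assumption).
graph = {'A': ('B',), 'B': ('C', 'F'), 'C': ('D', 'E'),
         'D': (), 'E': (), 'F': ('E',)}

first = list(dfs_preorder(graph, 'A'))
second = list(dfs_preorder(graph, 'A'))   # not affected by the first call
print(first)
```

Because the set is created inside the function when none is passed in, repeated calls are independent of each other.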
Using the feedback from Gareth and other kind users of Stack Overflow, this is what I came up with. It is clearer and also more general:
def _dfs(start_nodes, rules, visited):
    r"""
    Depth First Search

    start_nodes - Dictionary of Rule with dependencies (as Tuples):
        start_nodes = {'A': ('B','C')}

    rules - Dictionary of Rules with dependencies (as Tuples):
        e.g.
        rules = {'A':('B','C'), 'B':('D','E'), 'C':('E','F'),
                 'D':(), 'E':(), 'F':()}

    The above rules describe the following DAG:

            A
           / \
          B   C
         / \ / \
        D   E   F

    usage:
    >>> rules = {'A':('B','C'), 'B':('D','E'), 'C':('E','F'),
    ...          'D':(), 'E':(), 'F':()}
    >>> visited = []
    >>> list(_dfs({'A': ('B','C')}, rules, visited))
    [('A', ('B', 'C')), ('B', ('D', 'E')), ('D', ()), ('E', ()),
    ('C', ('E', 'F')), ('F', ())]
    """
    for rule, dep in start_nodes.items():
        if rule not in visited:
            yield rule, dep
            visited.append(rule)
            for ii in dep:
                new_dep = {ii: rules[ii]}
                # the recursive call already skips rules that were visited
                for item in _dfs(new_dep, rules, visited):
                    yield item
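Since the original goal was to decide the order in which rules run, the same depth-first idea can be turned into a topological ordering by recording each rule after its dependencies (post-order) instead of before them. What follows is a minimal sketch under that assumption, not the code from the post; topological_order is a hypothetical name, and the cycle check is an addition the post does not discuss.

```python
def topological_order(rules):
    """Return rule names so that every rule comes after its dependencies."""
    done = set()          # rules that are fully processed
    in_progress = set()   # rules on the current DFS path (cycle detection)
    order = []

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError('dependency cycle involving %r' % name)
        in_progress.add(name)
        for dep in rules[name]:
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)   # post-order: appended after all dependencies

    for name in sorted(rules):   # sorted only to make the result deterministic
        visit(name)
    return order


rules = {'A': ('B', 'C'), 'B': ('D', 'E'), 'C': ('E', 'F'),
         'D': (), 'E': (), 'F': ()}
order = topological_order(rules)
print(order)
```

For the DAG in the docstring above, this puts D, E and F ahead of B and C, and A last, so running the rules in that order satisfies every dependency.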
Here is another way to do a depth-first search that does not visit nodes repeatedly.
import pylab
import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([x for x in 'ABCDEF'])
G.nodes()
returns ['A', 'C', 'B', 'E', 'D', 'F']
G.add_edge('A','B')
G.add_edge('A','C')
G.add_edge('B','D')
G.add_edge('B','E')
G.add_edge('C','E')
G.add_edge('C','F')
Here is how to traverse the graph without repeating nodes.
nx.traversal.dfs_successors(G)
returns {'A': ['C', 'B'], 'B': ['D'], 'C': ['E', 'F']}, and you can draw the graph.
nx.draw(G,node_size=1000)