Find all node predecessors in a directed graph
I need to find all the connected elements in column ID. For example, my main element is 4120882840 in column ID.
4120882840 is connected to 4120874920 and 4120874720 (see column ID2). Likewise, 4120874920 is connected to 4121482000, which is further connected to 4121480930, and so on.
In the end, all the elements connected to 4120882840 are [4120882840, 4120874920, 4121482000, 4121480930, 4121480780, 4120874720, 4120871840, 4120871830], 8 in total.
But I am only getting the first 7, i.e. [4120882840, 4120874920, 4121482000, 4121480930, 4121480780, 4120874720, 4120871840].
File link: https://drive.google.com/file/d/1E5_cbGjtKoB6RDSsC7ned-X2RFoF6Rad/view?usp=sharing
This is my code:
import pandas as pd
df = pd.read_csv("Test.csv")
ID = df.iloc[:,1]
ID2 = df.iloc[:,2]
x = [4120882840]
for i in range (len(ID)):
    for element in x:
        if element == ID2[i]:
            newID = ID[i]
            #print (newID)
            x.append (newID)
print (x)
It looks like you want to find all predecessors of the node in question. This becomes clearer by inspecting the corresponding component subgraph:
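import matplotlib.pyplot as plt
import networkx as nx

# Build a directed graph with edges pointing from ID to ID2.
G = nx.from_pandas_edgelist(df, source='ID', target='ID2',
                            create_using=nx.DiGraph)
# Keep only the weakly connected component that contains the node of interest.
comps = nx.weakly_connected_components(G)
comp = next(comp for comp in comps if 4120882840 in comp)
H = nx.subgraph(G, comp)
plt.figure(figsize=(10,4))
nx.draw(H, node_color='lightgreen', with_labels=True, node_size=500)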
NetworkX has nx.edge_dfs, which we can use to find the node's predecessors: setting orientation='reverse' traverses every predecessor edge in reverse (upstream). Then we can flatten the returned list of tuples to obtain the corresponding nodes:
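from itertools import chain
source = 4120882840
# With orientation='reverse', each edge is yielded as (u, v, 'reverse');
# the unpacking keeps the two node columns and drops the direction labels.
*n, _ = zip(*(nx.edge_dfs(G, source, orientation='reverse')))
print(set(chain.from_iterable(n)))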
{4120874720, 4120871840, 4121480930, 4120874920, 4121480780,
4121482000, 4120871830, 4120882840}
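As a cross-check, NetworkX also provides nx.ancestors, which returns every node that has a path to a given node (the node itself is not included). The sketch below runs on a small made-up edge list rather than the asker's Test.csv, so the node labels and the names G_demo and source_demo are assumptions for illustration only.

```python
import networkx as nx

# Hypothetical edges; each pair mirrors one (ID, ID2) row.
edges = [(2, 1), (3, 1), (4, 2), (5, 4), (6, 7)]
G_demo = nx.DiGraph(edges)

source_demo = 1
# nx.ancestors gives every node that can reach source_demo;
# add the node itself to match the list expected in the question.
reachable = nx.ancestors(G_demo, source_demo) | {source_demo}
print(sorted(reachable))
```

Nodes 6 and 7 belong to a separate component, so they should not appear in the result.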
Is this what you're looking for?
import networkx as nx
graph = nx.from_pandas_edgelist(df, source='ID', target='ID2',
                                create_using=nx.DiGraph)
visited = set()
to_visit = [4120882840]
while to_visit:
    dst = to_visit.pop()
    visited.add(dst)
    for parent in graph.predecessors(dst):
        if parent in visited:
            continue
        to_visit.append(parent)
print(visited)
Output:
{4120874720, 4120871840, 4121480930, 4120874920, 4121480780, 4121482000, 4120871830, 4120882840}
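The same traversal can be sketched without NetworkX, assuming the edges are available as (ID, ID2) pairs. The edge list and the helper name all_predecessors are made up for illustration and are not part of the original post.

```python
from collections import defaultdict, deque

def all_predecessors(edges, start):
    """Return start plus every node that can reach it via (src, dst) edges."""
    # Map each node to the nodes that point at it.
    parents = defaultdict(list)
    for src, dst in edges:
        parents[dst].append(src)

    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents[node]:
            if parent not in visited:
                visited.add(parent)
                queue.append(parent)
    return visited

# Hypothetical (ID, ID2) pairs, including a cycle between 4 and 5.
demo_edges = [(2, 1), (3, 1), (4, 2), (5, 4), (4, 5), (6, 7)]
print(sorted(all_predecessors(demo_edges, 1)))
```

Marking a node as visited when it is queued, rather than when it is popped, keeps the loop from revisiting nodes even if the data contains a cycle.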
You can use a boolean mask to select the rows whose ID2 matches an element, and list(set(x)) to remove duplicates.
Here is an algorithm that works:
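x = [4120882840]
lastLen = 0
# Repeat until a full pass adds no new elements.
while lastLen != len(x):
    lastLen = len(x)
    # Iterate over a copy so that x is not extended while it is being looped over.
    for i in list(x):
        x += list(df['ID'][df['ID2'] == i])
    x = list(set(x))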
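A variant of the mask idea, sketched under the assumption that the columns are named ID and ID2 as in the question, looks up a whole frontier of elements at once with isin instead of building one mask per element. The DataFrame contents and the names df_demo, found and frontier are hypothetical.

```python
import pandas as pd

# Hypothetical data; only the column names follow the question.
df_demo = pd.DataFrame({"ID": [2, 3, 4, 5, 6], "ID2": [1, 1, 2, 4, 7]})

found = {1}      # everything collected so far
frontier = {1}   # elements whose parents have not been looked up yet
while frontier:
    # Rows whose ID2 is in the current frontier contribute their ID values.
    mask = df_demo["ID2"].isin(frontier)
    parents = set(df_demo.loc[mask, "ID"].tolist())
    frontier = parents - found
    found |= frontier
print(sorted(found))
```

Because only newly discovered elements go into the next frontier, each element is looked up at most once.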