Finding all predecessors of a node in a directed graph

Question

I need to find all connected elements in the ID column.

For example, my starting element is 4120882840 in the ID column.

4120882840 is connected to 4120874920 and 4120874720 (see column ID2). Likewise, 4120874920 is connected to 4121482000, which in turn is connected to 4121480930, and so on.

In the end, the elements connected to 4120882840 are [4120882840, 4120874920, 4121482000, 4121480930, 4121480780, 4120874720, 4120871840, 4120871830], 8 in total.

But I only get the first 7, i.e. [4120882840, 4120874920, 4121482000, 4121480930, 4121480780, 4120874720, 4120871840].

File link: https://drive.google.com/file/d/1E5_cbGjtKoB6RDSsC7ned-X2RFoF6Rad/view?usp=sharing

Here is my code:

import pandas as pd
df = pd.read_csv("Test.csv")
ID = df.iloc[:,1] 
ID2 = df.iloc[:,2] 

x = [4120882840]
for i in range (len(ID)):
    for element in x:
        if element == ID2[i]:
            newID = ID[i]
            #print (newID)
            x.append (newID)
print (x)

Answer 1

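It looks like you want to find all predecessors of the node in question. This becomes clearer when you inspect the corresponding component subgraph: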

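import matplotlib.pyplot as plt
import networkx as nx

# Build a directed graph with one edge ID -> ID2 per row of df
G = nx.from_pandas_edgelist(df, source='ID', target='ID2',
                            create_using=nx.DiGraph)

# Keep only the weakly connected component that contains the node of interest
comps = nx.weakly_connected_components(G)
comp = next(comp for comp in comps if 4120882840 in comp)
H = nx.subgraph(G, comp)
plt.figure(figsize=(10,4))
nx.draw(H, node_color='lightgreen', with_labels=True, node_size=500)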

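We can use this to find the node's predecessors. NetworkX has nx.edge_dfs, where we can set orientation='reverse' to traverse every incoming edge against its direction (upstream). We can then flatten the returned tuples to get the corresponding nodes: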

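from itertools import chain
source = 4120882840

# Each yielded item is (u, v, 'reverse'). zip(*...) transposes the items into
# columns, and the trailing column of direction labels is discarded into _.
*n, _ = zip(*(nx.edge_dfs(G, source, orientation='reverse')))
print(set(chain.from_iterable(n)))
# {4120874720, 4120871840, 4121480930, 4120874920, 4121480780,
#  4121482000, 4120871830, 4120882840}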
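
The unpacking step above is compact, so here is a small sketch of what it does, using plain Python only. The triples below are hypothetical and merely shaped like what nx.edge_dfs yields on a DiGraph when orientation='reverse' is set; the helper nodes_from_edges is also an illustrative name, not part of NetworkX. One thing to be aware of: if the source node has no incoming edges, the traversal yields nothing and the starred unpacking of an empty zip raises ValueError, so a comprehension-based variant may be safer.

```python
from itertools import chain

# Hypothetical triples shaped like the items yielded by
# nx.edge_dfs(G, source, orientation='reverse') on a DiGraph:
# (u, v, direction) for each edge walked upstream from node 1.
edges = [(2, 1, "reverse"), (3, 2, "reverse"), (4, 1, "reverse")]

# Transpose the triples into three columns, then drop the last column
# (the direction labels) and merge the two node columns into one set.
*columns, _ = zip(*edges)
nodes = set(chain.from_iterable(columns))
print(nodes)


def nodes_from_edges(source, edge_triples):
    """Collect every endpoint of the given edges, plus the source itself.

    Unlike the starred unpacking, this also works for an empty edge list.
    """
    return {source} | {node for u, v, _ in edge_triples for node in (u, v)}


print(nodes_from_edges(1, edges))
print(nodes_from_edges(1, []))
```

With the real graph, the second argument would be the edge_dfs call from the snippet above.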

Answer 2

Is this what you are looking for?

import networkx as nx
graph = nx.from_pandas_edgelist(df, source='ID', target='ID2',
                                create_using=nx.DiGraph)
visited = set()
to_visit = [4120882840]
while to_visit:
    dst = to_visit.pop()
    visited.add(dst)
    for parent in graph.predecessors(dst):
        if parent in visited:
            continue
        to_visit.append(parent)
print(visited)

Output

{4120874720, 4120871840, 4121480930, 4120874920, 4121480780, 4121482000, 4120871830, 4120882840}
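
A plausible reason why the loop in the question stops one node short, while the worklist above does not, is row order: the question's code sweeps over the rows only once, so a row that is visited before its ID2 value has been discovered is never looked at again. The sketch below reproduces that effect on a tiny hypothetical edge list (the rows and function names are made up for illustration, and single_pass is a simplified stand-in for the question's nested loop, not a copy of it).

```python
# Hypothetical (ID, ID2) rows: each row means "ID points to ID2".
# Row (4, 3) appears before the rows that lead to node 3 being discovered.
rows = [(4, 3), (2, 1), (3, 2)]


def single_pass(start, rows):
    """One sweep over the rows, in the spirit of the question's loop."""
    found = [start]
    for id1, id2 in rows:
        if id2 in found and id1 not in found:
            found.append(id1)
    return found


def worklist(start, rows):
    """Rescan the rows for every newly found node until none are pending."""
    visited = set()
    to_visit = [start]
    while to_visit:
        node = to_visit.pop()
        if node in visited:
            continue
        visited.add(node)
        to_visit.extend(id1 for id1, id2 in rows if id2 == node)
    return visited


print(single_pass(1, rows))  # [1, 2, 3]  (node 4 is missed)
print(worklist(1, rows))     # {1, 2, 3, 4}
```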

Answer 3

You can use a boolean mask to look up elements in ID2, and you can use list(set(x)) to remove duplicates.

Here is an algorithm that works:

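x = [4120882840]
lastLen = 0
# Repeat until a full pass adds no new IDs
while lastLen != len(x):
    lastLen = len(x)
    # Iterate over a snapshot so that x is not modified while being looped over
    for i in list(x):
        x += list(df['ID'][df['ID2'] == i])
    # Drop duplicates once per pass
    x = list(set(x))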
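
The loop above rescans the whole ID2 column for every element on every pass. As a rough alternative sketch, assuming the edges are available as (ID, ID2) pairs, one could build a reverse lookup table once and then walk it breadth-first. The function name all_predecessors and the sample pairs are hypothetical.

```python
from collections import defaultdict, deque


def all_predecessors(pairs, start):
    """Return start plus every node that can reach it via ID -> ID2 edges."""
    # Map each ID2 value to the list of IDs that point to it.
    preds = defaultdict(list)
    for id1, id2 in pairs:
        preds[id2].append(id1)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in preds.get(node, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical edges; (5, 6) is unrelated to node 1 and must not show up.
sample = [(2, 1), (3, 2), (4, 3), (5, 6)]
print(all_predecessors(sample, 1))
```

With the DataFrame from the question, the pairs could presumably be produced with zip(df['ID'], df['ID2']).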

Answer 1
1 2020-07-27 08:20:28

Answer 2
0 2020-07-26 19:36:12

Answer 3
0 2020-07-26 20:50:17
