How to find all triplets of nodes (connected components of size 3) from a tsv file?
From the matrix given below, I have to create a network and find all connected components of size 3. The dataset I am using is:
0 1 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0
Can anyone help? The expected connected triplets would be:
1 2 3
1 3 4
2 1 3
3 1 4
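My code:
import numpy as np
import pandas as pd

infile = "2_mutual_info_adjacency.tsv"
# Read the tab-separated adjacency matrix (the file has no header row)
df = pd.read_csv(infile, delimiter="\t", header=None)
arr = df.to_numpy()
n = arr.shape[0]
adj = []
for i in range(n):
    for j in range(n):
        if arr[i, j] == 1:
            # i is the centre: pair j with every later neighbour k of i
            for k in range(j + 1, n):
                if arr[i, k] == 1:
                    adj.append((i + 1, j + 1, k + 1))
            # j is the centre: pair i with every later neighbour l of j
            for l in range(i + 1, n):
                if arr[l, j] == 1:
                    adj.append((i + 1, j + 1, l + 1))
# Print the triplets with 1-based node numbers
for triplet in adj:
    print(*triplet)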
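Any help would be much appreciated. Thanks in advance.
You can use the function connected_components() to find all connected components. You can then filter out the components that consist of exactly three nodes: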
import networkx as nx
import pandas as pd
from itertools import chain
adj_matrix = [
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
df = pd.DataFrame(adj_matrix)
G = nx.from_pandas_adjacency(df)
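# Keep only the connected components that have exactly 3 nodes
triplets = [c for c in nx.connected_components(G) if len(c) == 3]
# Flatten them into one set of nodes that belong to some triplet
triplet_nodes = set(chain.from_iterable(triplets))
# Highlight triplet nodes in lime and all other nodes in pink
color = ['lime' if n in triplet_nodes else 'pink' for n in G.nodes()]
# In a Jupyter notebook, render the plot inline
%matplotlib inline
nx.draw(G, with_labels=True, node_color=color, node_size=1000)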
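The same filter can be tried on the matrix from the question without plotting. The following is a minimal sketch, not a definitive solution: the helper names load_graph, components_of_size and connected_triplets are hypothetical, the matrix is inlined as tab-separated text on the assumption that the real file has the same headerless layout, and nodes are numbered from 1 to match the question. On that matrix, nodes 1 to 4 form a single component of size four, so the size-3 filter finds nothing. The expected output in the question looks more like three nodes joined through a shared middle node, so the sketch also lists those, each once as (a, centre, b).

```python
import io
from itertools import combinations

import networkx as nx
import pandas as pd

# The question's 11x11 matrix, inlined as tab-separated text so the sketch
# runs without the original file (assumption: the real TSV looks like this).
TSV_TEXT = "\n".join("\t".join(row.split()) for row in """
0 1 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0
""".strip().splitlines())


def load_graph(source):
    """Build an undirected graph with 1-based node labels from a headerless TSV."""
    df = pd.read_csv(source, sep="\t", header=None)
    labels = list(range(1, len(df) + 1))
    df.index = labels
    df.columns = labels
    return nx.from_pandas_adjacency(df)


def components_of_size(G, size=3):
    """Connected components with exactly `size` nodes, each as a sorted list."""
    return [sorted(c) for c in nx.connected_components(G) if len(c) == size]


def connected_triplets(G):
    """Triples (a, centre, b) with a < b where centre is adjacent to both."""
    return [
        (a, centre, b)
        for centre in sorted(G.nodes())
        for a, b in combinations(sorted(G.neighbors(centre)), 2)
    ]


G = load_graph(io.StringIO(TSV_TEXT))
print(components_of_size(G))     # components with exactly three nodes
print(components_of_size(G, 4))  # components with exactly four nodes
print(connected_triplets(G))     # three nodes linked through a middle node
```

To run this against the real file, pass its path to load_graph instead of the io.StringIO object. Which of the two definitions of "triplet" is wanted depends on the intent of the question, so both are shown side by side.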