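How to visualize a network graph using Python and pandas?
I have the following DataFrame, which contains a file-number column and one column per department. Each file number is assigned to one or more departments, and the values (1, 0, 99) represent the status of the file, for example: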
df=pd.DataFrame({'file':[1205,2897,1171,1322,4312,2211,1242,52,443,111],
'finance':[1,99,0,0,1,99,1,0,1,99],
'IT':[1,99,0,1,99,1,0,0,99,0],
'marketing':[1,1,0,99,1,99,1,1,0,1]})
   file  finance  IT  marketing
0  1205        1   1          1
1  2897       99  99          1
2  1171        0   0          0
3  1322        0   1         99
4  4312        1  99          1
5  2211       99   1         99
6  1242        1   0          1
7    52        0   0          1
8   443        1  99          0
9   111       99   0          1
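My question is: is it possible to visualize these files and departments as a network graph?
The expected result is: (image not preserved)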
IIUC, you can use:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
# df = pd.DataFrame(...)  # the DataFrame from the question
COLORS = {0: 'red', 1: 'green'}  # edge color for each status value
# Reshape to one row per (file, department) pair and drop rows with status 99
edges = df.melt('file', var_name='department', value_name='status').query('status != 99')
G = nx.from_pandas_edgelist(edges, source='department', target='file', edge_attr=['status'])
# Departments go on the inner shell, files on the outer shell
departments = edges['department'].unique()
pos = nx.shell_layout(G, [departments, edges['file'].unique()])
node_colors = ['lightcoral' if node in departments else 'yellow' for node in G.nodes]
edge_colors = [COLORS[G.edges[edge]['status']] for edge in G.edges]
nx.draw_networkx(G, pos=pos, with_labels=True, node_color=node_colors, edge_color=edge_colors)
plt.show()
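Before drawing, it can help to confirm that the graph has the structure you expect. The following is a minimal sketch that rebuilds the graph from the question's data and inspects it without plotting. The names `n_nodes`, `n_edges`, and `dept_degree` are only illustrative and are not part of the original answer.

```python
import pandas as pd
import networkx as nx

# Same data as in the question
df = pd.DataFrame({'file': [1205, 2897, 1171, 1322, 4312, 2211, 1242, 52, 443, 111],
                   'finance': [1, 99, 0, 0, 1, 99, 1, 0, 1, 99],
                   'IT': [1, 99, 0, 1, 99, 1, 0, 0, 99, 0],
                   'marketing': [1, 1, 0, 99, 1, 99, 1, 1, 0, 1]})

# Long format: one row per (file, department) pair, rows with status 99 dropped
edges = df.melt('file', var_name='department', value_name='status').query('status != 99')
G = nx.from_pandas_edgelist(edges, source='department', target='file', edge_attr=['status'])

# Illustrative summary values (hypothetical names, not from the answer)
n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()
dept_degree = {d: G.degree(d) for d in edges['department'].unique()}

print(n_nodes, n_edges, dept_degree)
```

Each department's degree is the number of files linked to it after the status-99 rows are filtered out, so the degrees should add up to the number of rows in `edges`.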
>>> edges
    file department  status
0   1205    finance       1
2   1171    finance       0
3   1322    finance       0
4   4312    finance       1
6   1242    finance       1
7     52    finance       0
8    443    finance       1
10  1205         IT       1
12  1171         IT       0
13  1322         IT       1
15  2211         IT       1
16  1242         IT       0
17    52         IT       0
19   111         IT       0
20  1205  marketing       1
21  2897  marketing       1
22  1171  marketing       0
24  4312  marketing       1
26  1242  marketing       1
27    52  marketing       1
28   443  marketing       0
29   111  marketing       1
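If the circular shell layout feels crowded, one possible alternative (not part of the original answer) is a two-column layout using `nx.bipartite_layout`, with departments on one side and files on the other. This is a sketch under the same assumptions as above, namely that status 99 means no edge is drawn and that 0 and 1 map to red and green.

```python
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

# Same data as in the question
df = pd.DataFrame({'file': [1205, 2897, 1171, 1322, 4312, 2211, 1242, 52, 443, 111],
                   'finance': [1, 99, 0, 0, 1, 99, 1, 0, 1, 99],
                   'IT': [1, 99, 0, 1, 99, 1, 0, 0, 99, 0],
                   'marketing': [1, 1, 0, 99, 1, 99, 1, 1, 0, 1]})

COLORS = {0: 'red', 1: 'green'}  # assumed status-to-color mapping, as in the answer
edges = df.melt('file', var_name='department', value_name='status').query('status != 99')
G = nx.from_pandas_edgelist(edges, source='department', target='file', edge_attr=['status'])

# Departments form one column, all remaining nodes (files) the other
departments = list(edges['department'].unique())
pos = nx.bipartite_layout(G, departments)

node_colors = ['lightcoral' if node in departments else 'yellow' for node in G.nodes]
edge_colors = [COLORS[G.edges[edge]['status']] for edge in G.edges]

fig, ax = plt.subplots()
nx.draw_networkx(G, pos=pos, ax=ax, with_labels=True,
                 node_color=node_colors, edge_color=edge_colors)
# Call plt.show() in an interactive session to display the figure
```

The only change from the answer's code is the layout call; node and edge colors are computed the same way.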