
Networkx: How to create graph edges from a csv file?

I am trying to create a graph using networkx, and so far I have created nodes from the following text files: File 1 (user_id.txt) sample data:

user_000001
user_000002
user_000003
user_000004
user_000005
user_000006
user_000007

File 2 (user_country.txt) sample data: it also contains a few blank lines for users who did not enter their country details

 Japan
 Peru
 United States

 Bulgaria
 Russian Federation
 United States

File 3 (user_agegroup.txt) data: contains four age groups

 [12-18],[19-25],[26-32],[33-39]

I have two other files with the following sample data for adding edges to the graph

File 4 (id,agegroup.txt)

user_000001,[19-25]
user_000002,[19-25]
user_000003,[33-39]
user_000004,[19-25]
user_000005,[19-25]
user_000006,[19-25]
user_000007,[26-32]

File 5 (id,country.txt)

(user_000001,Japan)
(user_000002,Peru)
(user_000003,United States)
(user_000004,)
(user_000005,Bulgaria)
(user_000006,Russian Federation)
(user_000007,United States)

So far I have written the following code to draw graphs with only nodes: (Please check the code, because print g.number_of_nodes() never prints the correct number of nodes, though print g.nodes() shows the correct number of nodes.)

import csv
import networkx as nx
import matplotlib.pyplot as plt
g=nx.Graph()

#extract and add AGE_GROUP nodes in graph
f1 = csv.reader(open("user_agegroup.txt","rb"))
for row in f1: 
    g.add_nodes_from(row)
    nx.draw_circular(g,node_color='blue')

#extract and add COUNTRY nodes in graph
f2 = csv.reader(open('user_country.txt','rb'))
for row in f2:
    g.add_nodes_from(row) 
    nx.draw_circular(g,node_color='red')

#extract and add USER_ID nodes in graph
f3 = csv.reader(open('user_id.txt','rb'))
for row in f3:
    g.add_nodes_from(row)
    nx.draw_random(g,node_color='yellow')

print g.nodes()
plt.savefig("path.png")
print g.number_of_nodes()
plt.show()

Besides this, I can't figure out how to add edges from file4 and file5. Any help with code for that is appreciated. Thanks.

For simplification I made the user IDs [1,2,3,4,5,6,7] in the user_id.txt and id,country.txt files. You have some problems in your code:

1- First you add some nodes to the graph (for instance from the user_id.txt file), then you draw it, then you add some other nodes to the graph from another file, then you redraw the whole graph on the same figure. So, in the end you have many graphs in one figure.

2- You used the draw_circular method to draw twice, which is why the blue nodes never appeared: they are overwritten by the 'red' nodes.

I have made some changes to your code so that it draws only once, at the end. To draw nodes with the needed colors, I added an attribute called color when adding nodes. Then I used this attribute to build a color map, which I passed to the draw_networkx function. Finally, adding edges was a bit tricky because of the empty field in id,country.txt, so I had to remove empty nodes before drawing the graph. Here is the code, followed by the figure it produces.

import csv
import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()

# extract and add AGE_GROUP nodes in graph
with open("user_agegroup.txt", newline="") as f1:
    for row in csv.reader(f1):
        G.add_nodes_from(row, color="blue")

# extract and add COUNTRY nodes in graph
with open("user_country.txt", newline="") as f2:
    for row in csv.reader(f2):
        G.add_nodes_from(row, color="red")

# extract and add USER_ID nodes in graph
with open("user_id.txt", newline="") as f3:
    for row in csv.reader(f3):
        G.add_nodes_from(row, color="yellow")

# add user -> age group edges
with open("id,agegroup.txt", newline="") as f4:
    for row in csv.reader(f4):
        if len(row) == 2:  # add an edge only if the row has two fields
            G.add_edge(row[0], row[1])

# add user -> country edges
with open("id,country.txt", newline="") as f5:
    for row in csv.reader(f5):
        if len(row) == 2:  # add an edge only if the row has two fields
            G.add_edge(row[0], row[1])

# Remove empty nodes (iterate over a copy, since the graph changes in the loop)
for n in list(G.nodes()):
    if n == "":
        G.remove_node(n)

# color nodes according to their color attribute
color_map = []
for n in G.nodes():
    color_map.append(G.nodes[n]["color"])
nx.draw_networkx(G, node_color=color_map, with_labels=True, node_size=500)

plt.savefig("path.png")

plt.show()

[Image: the resulting graph]
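The sample rows of File 5 in the question are wrapped in parentheses, which the answer above sidesteps by simplifying the files. If the file really looks like the sample, a minimal sketch along the following lines could strip the parentheses and skip rows with a missing country. The helper name read_edges and the inline SAMPLE string are hypothetical, introduced only for illustration, and the sketch assumes that no real field value begins or ends with a parenthesis.

```python
import csv
import io

# Sample rows copied from File 5 in the question (assumed to be the real format).
SAMPLE = """(user_000001,Japan)
(user_000002,Peru)
(user_000003,United States)
(user_000004,)
(user_000005,Bulgaria)
"""


def read_edges(lines):
    """Return (user_id, country) pairs, skipping rows with a missing field."""
    edges = []
    for row in csv.reader(lines):
        # Drop the wrapping parentheses and any surrounding spaces.
        fields = [field.strip(" ()") for field in row]
        if len(fields) == 2 and all(fields):
            edges.append((fields[0], fields[1]))
    return edges


edges = read_edges(io.StringIO(SAMPLE))
print(edges)
```

The resulting list of tuples can be passed to G.add_edges_from(edges) in one call, and because rows with an empty country are skipped while reading, no empty node needs to be removed afterwards.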

You can use a for loop like this (here df_edges is a pandas DataFrame with source and target columns):

for a,b in df_edges.iterrows():
    G.add_edges_from([(b['source'], b['target'])])
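As a possible alternative to looping over rows, the sketch below builds the graph from the whole DataFrame at once. The contents of df_edges are made-up sample data borrowed from the question, the column names source and target are assumed from the snippet above, and nx.from_pandas_edgelist requires networkx 2.0 or later.

```python
import networkx as nx
import pandas as pd

# Hypothetical edge table; the column names mirror the loop shown above.
df_edges = pd.DataFrame(
    {
        "source": ["user_000001", "user_000002", "user_000003"],
        "target": ["Japan", "Peru", "United States"],
    }
)

# Build the graph in one call instead of iterating over rows.
G = nx.from_pandas_edgelist(df_edges, source="source", target="target")

# Equivalent without the helper: zip the two columns into (u, v) tuples.
H = nx.Graph()
H.add_edges_from(zip(df_edges["source"], df_edges["target"]))
```

Both graphs end up with the same edges; the zip form is handy when the graph already exists and the edges only need to be added to it.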
