NetworkX csv edgelist structure
Is there a standard structure for adding edges from a csv/txt file into NetworkX? I've read the docs and have tried using read_edgelist('path.csv') and add_edges_from('path.csv'), but have received errors saying my data cannot be converted into dictionaries, and also "Edge tuple C must be a 2-tuple or a 3-tuple". I've reformatted a sample of my data several ways to test different structures, including lists of lists and lists of tuples, removing white space, and also creating a single list of numbers in each row, but no luck. Below is some sample data of mine:
user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
2139671,"[[89, 125]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
10024312,"[[123, 27], [27, 97], [97, 97], [97, 97], [97,110]]"
14270422,"[[0, 110], [110, 174]]"
14283758,"[[110, 184]]"
14373703,"[[35, 97], [97, 97], [97, 97], [97, 17], [17,58]]"
The purpose is to create a network graph of trajectories moving between (or within) clusters. Each list is a sequence of moves within a cluster or between clusters, e.g., [[0, 110], [110, 174]] is a move through clusters 0->110->174. Is there a way to format my data such that networkx might be able to read it?
Quick sample code I was testing data with:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
edges = g.add_edges_from('path.csv')
nx.draw(g)
plt.draw
plt.show()
Edit
Is it possible to add edge weights to this data structure when reading it into networkx, and then adjust the weight based on the count/frequency of an edge? I would like to do this so I can visualize edges that have a higher frequency/count with another color/line weight. Using the answer below, I have tried using g.add_weighted_edges_from() and using weight=1 as an attribute instead of using g.add_edges_from(), but this did not work properly. I also tried using this with no luck:
for u,v,d in g.edges():
    d['weight'] = 1
g.edges(data=True)
edges = g.edges()
weights = [g[u][v]['weight'] for u,v in edges]
First of all, your data is not a valid csv file. From Comma-separated values:
Fields with embedded commas or double-quote characters must be quoted.
Which means you should use double quotes to quote your list:
user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
2139671,"[[89, 125]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
10024312,"[[123, 27], [27, 97], [97, 97], [97, 97], [97,110]]"
14270422,"[[0, 110], [110, 174]]"
14283758,"[[110, 184]]"
14373703,"[[35, 97], [97, 97], [97, 97], [97, 17], [17,58]]"
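As a quick check of the quoting rule, here is a minimal sketch that parses the same row with and without quotes using the standard csv module. The in-memory strings are only for illustration; a real script would read from the file.

```python
import csv
import io

# The same record, once with the list quoted and once without.
quoted = '14270422,"[[0, 110], [110, 174]]"\n'
unquoted = '14270422,[[0, 110], [110, 174]]\n'

quoted_row = next(csv.reader(io.StringIO(quoted)))
unquoted_row = next(csv.reader(io.StringIO(unquoted)))

# With quotes the list stays in one field; without them every
# comma inside the list starts a new field.
print(len(quoted_row), quoted_row)
print(len(unquoted_row), unquoted_row)
```

The quoted row comes back as two fields (the user id and the whole list as one string), while the unquoted row is split at every comma inside the list.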
You can then use the csv module to read this file, convert each string to a list with eval(), and create the network graph with add_edges_from:
import csv
import networkx as nx
import matplotlib.pyplot as plt

g = nx.Graph()
with open('ooo.csv', 'r', newline='') as f:
    for row in csv.reader(f):
        if '[' in row[1]:  # skip the header row, which contains no list
            g.add_edges_from(eval(row[1]))
nx.draw(g)
plt.show()
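For the weights asked about in the edit, one possible approach (a sketch, not the only way) is to count how often each edge occurs with collections.Counter and pass the counts to add_weighted_edges_from. This version also swaps eval() for ast.literal_eval(), which only accepts Python literals. The sample rows are copied from the question and read from an in-memory string. Treating (a, b) and (b, a) as the same edge is an assumption that matches the undirected nx.Graph used above.

```python
import ast
import csv
import io
from collections import Counter

import networkx as nx

# Sample rows from the question; a real script would open the csv file instead.
data = '''user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
14270422,"[[0, 110], [110, 174]]"
'''

counts = Counter()
reader = csv.reader(io.StringIO(data))
next(reader)  # skip the header row
for user_id, moves in reader:
    # literal_eval parses the list without executing arbitrary code
    for u, v in ast.literal_eval(moves):
        # sort so that (a, b) and (b, a) count as the same undirected edge
        counts[tuple(sorted((u, v)))] += 1

g = nx.Graph()
# each edge gets a 'weight' attribute equal to how often it occurred
g.add_weighted_edges_from((u, v, n) for (u, v), n in counts.items())

# one width per edge, in the same order as g.edges()
widths = [g[u][v]['weight'] for u, v in g.edges()]
print(g[110][110]['weight'])
```

Passing the list to the drawing call, e.g. nx.draw(g, width=widths), should then draw the more frequent moves with thicker lines. If the direction of a move matters, nx.DiGraph() without the sorting step may be a better fit.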