NetworkX csv edge list structure

Question

Is there a standard structure for adding edges from a csv/txt into NetworkX? I've read the docs and have tried using read_edgelist('path.csv') and add_edges_from('path.csv') but have received errors saying my data cannot be converted into dictionaries, and also "Edge tuple C must be a 2-tuple or a 3-tuple". I've reformatted a sample of my data several ways to test different structures, including lists of lists and lists of tuples, removing white space, and also creating a single list of numbers in each row, but no luck. Below is some sample data of mine:

user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
2139671,"[[89, 125]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
10024312,"[[123, 27], [27, 97], [97, 97], [97, 97], [97,110]]"
14270422,"[[0, 110], [110, 174]]"
14283758,"[[110, 184]]"
14373703,"[[35, 97], [97, 97], [97, 97], [97, 17], [17,58]]"

The purpose is to create a network graph of trajectories moving between (or within) clusters. Each list is a sequence of moves, each either within a cluster or between clusters, e.g., [[0, 110], [110, 174]] is a move through clusters 0->110->174. Is there a way to format my data such that networkx might be able to read it?

Quick sample code I was testing the data with:

import networkx as nx
import matplotlib.pyplot as plt

g = nx.Graph()
edges = g.add_edges_from('path.csv')

nx.draw(g)
plt.draw
plt.show()

Edit

Is it possible to add edge weights to this data structure when reading it into networkx, and then adjust the weight based on the count/frequency of an edge? I would like to do this so I can visualize edges that have a higher frequency/count with a different color/line weight. Using the answer below, I have tried using g.add_weighted_edges_from() and using weight=1 as an attribute instead of using g.add_edges_from(), but this did not work properly. I also tried this, with no luck:

for u,v,d in g.edges():
    d['weight'] = 1
g.edges(data=True)
edges = g.edges()
weights = [g[u][v]['weight'] for u,v in edges]

Answer 1

First of all, your data is not a valid CSV file. From Comma-separated values:

Fields with embedded commas or double-quote characters must be quoted.

This means you should wrap each list in double quotes:

user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
2139671,"[[89, 125]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
10024312,"[[123, 27], [27, 97], [97, 97], [97, 97], [97,110]]"
14270422,"[[0, 110], [110, 174]]"
14283758,"[[110, 184]]"
14373703,"[[35, 97], [97, 97], [97, 97], [97, 17], [17,58]]"
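To illustrate why the quoting matters, here is a minimal sketch that parses one record from the sample both ways with the standard csv module. The unquoted variant is a hypothetical reconstruction of what an unquoted file would look like; the variable names are illustrative only.

```python
import csv
import io

# The same record, once with the list quoted and once without (hypothetical).
quoted = '11011,"[[86, 110], [110, 110]]"\n'
unquoted = '11011,[[86, 110], [110, 110]]\n'

quoted_row = next(csv.reader(io.StringIO(quoted)))
unquoted_row = next(csv.reader(io.StringIO(unquoted)))

print(len(quoted_row))    # 2 fields: user_id and the whole list string
print(len(unquoted_row))  # 5 fields: the list is split at every comma
```

With quotes, the second field arrives as a single string that can be converted back into a list; without them, the reader splits the list at each embedded comma.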

You can then use the csv module to read this file, convert the string to a list with eval(), and create a network graph with add_edges_from:

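import csv
import networkx as nx
import matplotlib.pyplot as plt

g = nx.Graph()
with open('ooo.csv', 'r', newline='') as f:
    for row in csv.reader(f):
        # Skip the header row: only data rows contain a bracketed list.
        if '[' in row[1]:
            # eval() turns the string into a list of [u, v] pairs;
            # only use it on files you trust.
            g.add_edges_from(eval(row[1]))

nx.draw(g)
plt.show()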
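The edit to the question asks about weighting edges by how often they occur. One possible approach, sketched here under assumptions rather than taken from the answer itself, is to count each move first and then add the counts as weights. The inline CSV_TEXT (a few rows copied from the sample), the helper name count_moves, and the choice to treat (a, b) and (b, a) as the same undirected edge are all assumptions for illustration. The sketch also swaps eval() for ast.literal_eval(), which only accepts Python literals.

```python
import ast
import csv
import io
from collections import Counter

import networkx as nx

# A few rows from the sample data, inlined so the sketch runs without a file.
CSV_TEXT = '''user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
14270422,"[[0, 110], [110, 174]]"
'''


def count_moves(csv_file):
    """Count how often each (u, v) move occurs across all rows."""
    counts = Counter()
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    for _user_id, moves in reader:
        # literal_eval parses the list string without executing code
        for u, v in ast.literal_eval(moves):
            # sort so (a, b) and (b, a) count as the same undirected edge
            counts[tuple(sorted((u, v)))] += 1
    return counts


counts = count_moves(io.StringIO(CSV_TEXT))

g = nx.Graph()
# Each 3-tuple (u, v, n) becomes an edge whose 'weight' attribute is n.
g.add_weighted_edges_from((u, v, n) for (u, v), n in counts.items())

# Weights in the same order as g.edges(), e.g. for nx.draw(g, width=widths)
widths = [g[u][v]['weight'] for u, v in g.edges()]
```

Note that g.edges() yields (u, v) pairs only, so a loop of the form "for u, v, d in g.edges()" cannot unpack a data dictionary; g.edges(data=True) yields (u, v, d) triples instead. For a real file, open it with open(path, newline='') and pass the file object to count_moves in place of the StringIO.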

Answer 1, accepted, 2017-04-10 13:30:42
