Create NetworkX Graph from CSV file

I am trying to build a NetworkX social network graph from a CSV file. I am using NetworkX 2.1 and Python 3.

I followed this post with no luck, because I keep receiving the error:

AttributeError: 'list' object has no attribute 'decode'

My goal is to draw edges with higher weights as thicker lines.

Here is my code so far:

import networkx as nx
import csv

Data  = open('testest.csv', "r", encoding='utf8')
read = csv.reader(Data)
Graphtype=nx.Graph()   # use net.Graph() for undirected graph

G = nx.read_edgelist(read, create_using=Graphtype, nodetype=int, data=(('weight',float),))

for x in G.nodes():
      print ("Node:", x, "has total #degree:",G.degree(x), " , In_degree: ", G.out_degree(x)," and out_degree: ", G.in_degree(x))   
for u,v in G.edges():
      print ("Weight of Edge ("+str(u)+","+str(v)+")", G.get_edge_data(u,v))

nx.draw(G)
plt.show()

Is there a simpler way to approach this? The data is relatively simple.

Thank you for your help!

You are misusing the function read_edgelist. According to the documentation, each line it reads must be a string, while csv.reader parses the lines in the input file into lists of strings (for example, 202,237,1 -> ['202', '237', '1']). Therefore, AttributeError is raised because read_edgelist is trying to parse the lists provided by csv.reader and, as the error message shows, calls decode on them as if they were text.
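The mismatch can be reproduced with the standard library alone. This is a minimal sketch; the two-line sample is hypothetical and stands in for the real CSV file:

```python
import csv
import io

# Hypothetical sample standing in for the CSV file (no header here).
sample = io.StringIO("202,237,1\n203,238,2\n")

rows = list(csv.reader(sample))

# Each row is a list of strings, not a single string.
print(rows[0])                     # ['202', '237', '1']

# Lists have no .decode method, which is what the AttributeError reports.
print(hasattr(rows[0], "decode"))  # False
```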

We can correctly parse the graph from the input file without using the csv module. However, we still need to deal with the first line (the header) of the input file, which should not be parsed. There are two methods. The first method skips the first line using next:

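import networkx as nx

Graphtype = nx.Graph()

# Text mode uses universal newlines, so \r line endings are split correctly.
with open('test.csv', "r") as Data:
    next(Data, None)  # skip the first line (the header) in the input file
    G = nx.parse_edgelist(Data, delimiter=',', create_using=Graphtype,
                          nodetype=int, data=(('weight', float),))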

The second method is a bit "hacky": since the first line starts with target, we mark the character t as the start of a comment in the input file. The parser then discards everything from the first t on each line, so this only works because the data lines are purely numeric and contain no t.

import networkx as nx

Graphtype = nx.Graph()

with open('test.csv', "r") as Data:
    # comments='t' drops everything from the first 't' on each line
    G = nx.parse_edgelist(Data, comments='t', delimiter=',', create_using=Graphtype,
                          nodetype=int, data=(('weight', float),))
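To see why the comment-character trick is fragile, here is a small sketch with in-memory lines instead of a file. The header text and the third line with names are hypothetical, added only to show the side effect:

```python
import networkx as nx

# Hypothetical input lines; the last one uses names instead of numeric ids.
lines = [
    "target,source,weight",  # header, starts with 't'
    "202,237,1",             # numeric line, contains no 't'
    "tom,ann,3",             # also starts with 't', so it is treated as a comment
]

# No nodetype here, so node labels stay as strings.
G = nx.parse_edgelist(lines, comments='t', delimiter=',',
                      data=(('weight', float),))

print(G.number_of_edges())  # only the numeric line survives
print(G.has_node('tom'))    # the line starting with 't' was dropped silently
```

Skipping the header with next avoids this side effect, which is why the first method is usually the safer choice.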

In both methods, we have to use parse_edgelist instead of read_edgelist because the input file uses \r for newlines. To use read_edgelist, the file needs to be opened in binary mode, where lines are split only if the newlines are \r\n or \n. Thus an input file with \r newlines cannot be split into lines, and so cannot be parsed correctly.
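The difference between the two modes can be checked with the standard library. This sketch writes a hypothetical three-line file that uses \r as its only line ending, then reads it back both ways:

```python
import os
import tempfile

# Hypothetical file contents with bare \r line endings.
raw = b"target,source,weight\r202,237,1\r203,238,2\r"

with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(raw)
    path = tmp.name

# Text mode (default newline=None) recognises \r, \n and \r\n as line endings.
with open(path, "r") as f:
    text_lines = f.readlines()

# Binary mode splits on \n only, so the whole file comes back as one line.
with open(path, "rb") as f:
    binary_lines = f.readlines()

os.remove(path)

print(len(text_lines), len(binary_lines))  # 3 1
```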

Also, since you want to find the in-degrees and out-degrees, the graph should be created using DiGraph, not Graph.
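A minimal sketch of the directed version, using hypothetical in-memory lines without a header. Note that parse_edgelist treats the first column as the start node of each edge and the second as the end node, so check that this matches the column order of your file:

```python
import networkx as nx

# Hypothetical edge lines: first column -> second column, then the weight.
lines = ["202,237,1", "202,238,2", "237,238,3"]

G = nx.parse_edgelist(lines, delimiter=',', create_using=nx.DiGraph(),
                      nodetype=int, data=(('weight', float),))

for node in sorted(G.nodes()):
    print("Node:", node,
          "in_degree:", G.in_degree(node),
          "out_degree:", G.out_degree(node))
```

On an undirected Graph, in_degree and out_degree are not available, so only degree can be reported there.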

Edit

The key point here is to skip the header in the input file. We can achieve this by first reading the input file into a pandas.DataFrame, then converting it to a graph.

import networkx as nx
import pandas as pd

df = pd.read_csv('test.csv')
Graphtype = nx.Graph()
G = nx.from_pandas_edgelist(df, edge_attr='weight', create_using=Graphtype)
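For the original goal of drawing thicker edges for higher weights, one possible approach is to pass a list of widths to nx.draw. The sketch below is an illustration, not part of the answer above: the helper names edge_widths and draw_weighted and the scale parameter are hypothetical, the sample edges are made up, and matplotlib is assumed to be installed if draw_weighted is called.

```python
import networkx as nx


def edge_widths(G, scale=1.0):
    """Return one line width per edge, in G.edges() order, proportional to 'weight'."""
    # Edges without a 'weight' attribute fall back to 1.0 (an assumption).
    return [scale * data.get('weight', 1.0) for _, _, data in G.edges(data=True)]


def draw_weighted(G, scale=1.0):
    """Draw G with edge thickness proportional to weight (requires matplotlib)."""
    import matplotlib.pyplot as plt

    edges = list(G.edges())  # same order as used by edge_widths
    nx.draw(G, edgelist=edges, width=edge_widths(G, scale), with_labels=True)
    plt.show()


# Hypothetical graph standing in for the one built from the CSV file.
G = nx.Graph()
G.add_weighted_edges_from([(202, 237, 1.0), (203, 238, 4.0)])

widths = edge_widths(G)
print(widths)
```

Calling draw_weighted(G) opens the plot window. If the weights span a wide range, a smaller scale keeps the heaviest edges readable.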
