Create an adjacency matrix in Python

I want to load a CSV or text file describing a signed (weighted) graph and create an adjacency matrix from it. The CSV file contains three columns named "FromNodeId", "ToNodeId" and "Sign". The code I used is as follows:

import networkx as nx

G = nx.read_edgelist('soc-sign-epinions.txt', data=[('Sign', int)])
#print(G.edges(data = True))

A = nx.adjacency_matrix(G)
print(A.todense())

I encountered the following error:

ValueError: array is too big; `arr.size * arr.dtype.itemsize` is larger than 
the maximum possible size

How can I solve this problem? Please suggest a way to create the adjacency matrix.

The memory needed to store a large matrix can easily get out of hand, which is why nx.adjacency_matrix(G) returns a sparse matrix. A sparse matrix stores only the nonzero entries, which is far more efficient when most entries are 0.

Since your graph has about 131000 vertices, the full (dense) adjacency matrix has 131000^2 ≈ 1.7 × 10^10 entries. At 8 bytes per entry (a 64-bit integer in a NumPy array), that is roughly 137 GB; if every entry were a separate Python integer object (24 bytes or more each), it would be about 400 GB. Either way it does not fit in memory. However, your graph has fewer than 0.01% of all possible edges; in other words, it is very sparse, and a sparse matrix will work for you.
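As a rough back-of-the-envelope check, the dense and sparse footprints can be compared with a few lines of arithmetic. This is only a sketch: the 8-byte value size, the 4-byte index size, and the CSR layout are assumptions, and the nonzero count is simply the 0.01% upper bound mentioned above rather than the real edge count.

```python
# Rough memory estimate; all sizes below are assumptions, not measurements.
n_nodes = 131_000          # approximate vertex count quoted above
value_bytes = 8            # assumed: 64-bit integer per stored value
index_bytes = 4            # assumed: 32-bit column index / row pointer

# Dense storage keeps every one of the n * n entries.
dense_bytes = n_nodes * n_nodes * value_bytes

# Upper bound on nonzeros: 0.01% of all n * n entries.
nnz_upper = n_nodes * n_nodes // 10_000

# CSR storage: one value and one column index per nonzero,
# plus one row pointer per row (n + 1 in total).
sparse_bytes = nnz_upper * (value_bytes + index_bytes) + (n_nodes + 1) * index_bytes

print(f"dense : {dense_bytes / 1e9:.1f} GB")
print(f"sparse: {sparse_bytes / 1e6:.1f} MB (upper bound)")
```

Under these assumptions the dense matrix needs on the order of a hundred gigabytes, while the sparse one stays in the tens of megabytes.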

To get the sparse matrix, just use A = nx.adjacency_matrix(G) without calling A.todense() afterwards (todense() tries to store the full matrix densely again, which is what triggers the error).
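A minimal sketch of working with the sparse result directly, using a tiny hand-made graph in place of the real edge list (the node names and edges are hypothetical). It also passes weight="Sign", on the assumption that the signs should appear as matrix entries; by default networkx looks for an edge attribute named "weight" and uses 1 when it is missing.

```python
import networkx as nx

# Toy signed graph standing in for the real data (hypothetical edges).
G = nx.Graph()
G.add_edge("a", "b", Sign=1)
G.add_edge("b", "c", Sign=-1)

# Fix the row/column order explicitly and use the "Sign" attribute as the entry value.
nodes = ["a", "b", "c"]
A = nx.adjacency_matrix(G, nodelist=nodes, weight="Sign")

# The sparse object can be inspected without ever densifying it.
print(A.shape)   # matrix dimensions
print(A.nnz)     # number of stored (nonzero) entries
print(A[1, 2])   # entry for the edge b-c
```

Because the graph is undirected, each edge is stored twice (once per direction), so the number of stored entries is twice the number of edges.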

scipy.sparse has built-in functions to efficiently save and load sparse matrices (save_npz and load_npz). For example, to save your sparse matrix A, use:

import scipy.sparse

scipy.sparse.save_npz('filename.npz', A)
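For completeness, here is a small round trip showing the matching load call, scipy.sparse.load_npz. The matrix and the temporary file path are made up for illustration; with the real data, A would come from nx.adjacency_matrix(G).

```python
import os
import tempfile

import numpy as np
import scipy.sparse

# Small stand-in matrix (hypothetical values).
A = scipy.sparse.csr_matrix(np.array([[0, 1, 0],
                                      [1, 0, -1],
                                      [0, -1, 0]]))

# Save to a temporary location, then load it back.
path = os.path.join(tempfile.mkdtemp(), "adjacency.npz")
scipy.sparse.save_npz(path, A)
B = scipy.sparse.load_npz(path)

# Two sparse matrices are equal when no entry differs.
same = (A != B).nnz == 0
print(same)
```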

If it is important for you to use txt or CSV, you will have to do it manually. This can be done by iterating over the rows of your matrix and writing them one by one to your file:

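import numpy as np

with open('adjacency.csv', 'w') as f:  # placeholder filename
    for i in range(A.shape[0]):
        row = A[[i], :].toarray()  # densify only one row at a time
        np.savetxt(f, row, fmt='%d', delimiter=',')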

This might take a few minutes to run, but should work (I tested it with a path graph of the same size).
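A dense text dump has one field per matrix entry, so its size grows with the square of the vertex count. If that turns out to be too large, one alternative (not part of the approach above) is to write only the nonzero entries as row, column, value triplets. The sketch below uses a small made-up matrix and writes to an in-memory buffer; the column names are arbitrary.

```python
import csv
import io

import numpy as np
import scipy.sparse

# Small stand-in matrix (hypothetical values).
A = scipy.sparse.csr_matrix(np.array([[0, 1, 0],
                                      [1, 0, -1],
                                      [0, -1, 0]]))

# COO format exposes parallel arrays of row indices, column indices and values.
coo = A.tocoo()

buffer = io.StringIO()  # swap for open('some_file.csv', 'w', newline='') to write to disk
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["row", "col", "value"])  # arbitrary header names
for r, c, v in zip(coo.row, coo.col, coo.data):
    writer.writerow([int(r), int(c), int(v)])

text = buffer.getvalue()
print(text)
```

The indices written here are matrix positions, so the node order used when building the matrix has to be kept alongside the file to map them back to node IDs.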
