
Creating an n×n symmetric binary data matrix in Python

I want to create an n×n symmetric matrix in Python. Let's say n = 9; then I want something like the following:

array[[0,1,0,0,0,1,1,0,1],[1,0,1,1,1,0,0,0,0],[0,1,0,1,1,0,0,0,0]….]. 

I know how to do this by first creating an n×n matrix of zeros (np.zeros((9,9))) and then using a loop to populate it with ones and zeros. But I feel that is not a Pythonic way, and loops would slow the code down if the matrix is big, so I am looking for an optimised approach.

Basically, it's the adjacency matrix I am creating for an undirected graph. My follow-up question is how to plot the graph given its adjacency matrix. Are there any functions that plot an undirected graph from an adjacency matrix?

Please advise. I want to learn the best optimised/Pythonic way of doing this in Python rather than using traditional loops.

EDIT:

I used the following to create an edge list for a 30x30 adjacency matrix. But this edge list doesn't contain a pair for every two nodes in a cluster; if I typed them all out, the list would be huge. Consequently, my graph below doesn't have edges between each pair of nodes in a cluster. How can I automate this edge list so that I don't have to type all the edge pairs manually? In the graph, I want each node in a cluster to have an edge to every other node in that cluster, and only nodes 1 and 2 should have between-cluster edges, to nodes 16 and 17 of the other cluster.

import numpy as np

N=30
# Creating a matrix of zeros.
W=np.zeros((N,N))
# Listing the edges to start with: two 15-node clusters, connected to each other by two pairs of nodes.
edge=[[1,2],[1,3],[1,4],[1,5],[1,6],[1,7],[1,8],[1,9],[1,10],[1,11],[1,12],[1,13],[1,14],[1,15],
      [16,17],[16,18],[16,19],[16,20],[16,21],[16,22],[16,23],[16,24],[16,25],[16,26],[16,27],[16,28],[16,29],[16,30],
      [1,16],[2,17]]

# Function for creating the adjacency matrix: populates the zeros matrix with 1s where an edge joins two nodes.
def adjacencyMatrix():
    """Fill the global matrix W from the global edge list.
    input-> none (uses the module-level `edge` list of 1-based node pairs)
    output-> none (W is modified in place to become the (N,N) adjacency matrix)
    """
    for first,second in edge:
        W[first-1,second-1]=W[second-1,first-1]=1

Graph: [figure: plot of the graph built from the edge list above]

If all you care about is having the graph and an adjacency matrix, do you have to build the graph from the matrix? Or are you happy to do it the other way around instead?

You should look at networkx.

Bearing in mind the comment: you have a set of edges, which you know in advance (or at least you know how you want to create them), and you want to plot the graph. Now, you could create the adjacency matrix separately if you wanted, something like this:

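N = 30  # number of nodes, as in the question
A = [[0 for _ in range(N)] for _ in range(N)]
edges = [[1, 2], [3, 4], [6, 1]]  # ..., etc.
for start, finish in edges:
    # Node labels are used directly as 0-based indices here.
    A[start][finish] = A[finish][start] = 1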

You could then do the plotting as below, but why would you want to when networkx gives you all of that functionality anyway? You create an adjacency matrix by telling it what edges you have. The graph and the adjacency matrix hold exactly the same information, just in different formats, so it makes no difference which way you do it (and it could be argued that adding edges to the graph is more readable too).
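If the matrix itself is what you need, the loop over edges can also be replaced by NumPy index arrays. This is only a sketch: the helper name `adjacency_from_edges` is made up for illustration, and it assumes 1-based node labels as in the question's edit.

```python
import numpy as np

def adjacency_from_edges(edges, n):
    """Return an n x n symmetric 0/1 matrix for 1-based edge pairs (hypothetical helper)."""
    idx = np.asarray(edges, dtype=int) - 1   # shift 1-based labels to 0-based indices
    A = np.zeros((n, n), dtype=int)
    A[idx[:, 0], idx[:, 1]] = 1              # one direction of each edge
    A[idx[:, 1], idx[:, 0]] = 1              # mirrored entries keep A symmetric
    return A

edges = [[1, 2], [3, 4], [6, 1]]
A = adjacency_from_edges(edges, 6)
```

Both assignments index with whole arrays at once, so no Python-level loop runs over the edges.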

From your edit, you want to have two clusters of nodes, with all the nodes within each cluster joined to each other, plus a couple of extra edges. You mention that it would be tedious to do this manually, and you're right: so do it programmatically.

import networkx as nx
from matplotlib import pyplot

G=nx.Graph()

# Why not group your nodes into clusters, since that's how you plan on using them.
node_clusters = [range(10), range(10,20)]


for node_cluster in node_clusters:
  for node in node_cluster:
    for other_node in node_cluster:
      if node != other_node:
        G.add_edge(node, other_node) # we don't actually need to add nodes, as the `add_edge` will add the nodes for us. 

#Add manual edges
G.add_edge(0,10)
G.add_edge(1, 11)


# nx.adjacency_matrix returns a SciPy sparse matrix (SciPy must be installed).
A = nx.adjacency_matrix(G)
print(A)

nx.draw(G)

pyplot.show()

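The nested loops above visit every pair twice. As a possible alternative, `itertools.combinations` yields each unordered pair once, which also gives you a plain edge list in the 1-based layout from the question's edit. The helper name `cluster_edges` is an assumption for illustration, not part of networkx.

```python
from itertools import combinations

def cluster_edges(clusters, extra_edges=()):
    """Return every within-cluster pair, followed by the extra edges (hypothetical helper)."""
    edges = []
    for cluster in clusters:
        # combinations yields each unordered pair exactly once
        edges.extend(combinations(cluster, 2))
    edges.extend(tuple(e) for e in extra_edges)
    return edges

# Layout from the question's edit: nodes 1-15 and 16-30, joined by (1, 16) and (2, 17).
clusters = [range(1, 16), range(16, 31)]
edge = cluster_edges(clusters, extra_edges=[(1, 16), (2, 17)])
```

A list like this can be handed to `G.add_edges_from(edge)` or used to fill a matrix.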

Honestly though, if every node in each cluster is connected to every other, there's not much point in drawing all the connections; summarising each cluster as one larger node might make a nicer drawing.

Adjacency matrices are usually sparse (nnz ~ O(N)), so they are usually stored in a sparse format. The simplest one is the coo format, which is basically three arrays: [row_ids, col_ids, values]. The csr and csc formats are somewhat more complex to get used to, but have higher performance. The core benefit of a sparse representation is that when you start performing, say, a matvec (matrix-vector product), you usually get a huge speed-up (the asymptotic complexity is lower under the assumption nnz ~ O(N)).
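To make the three-array idea concrete, here is a small sketch of a matvec done directly on coo-style triplets with plain NumPy. The function name `coo_matvec` and the 4-node path graph are illustrative assumptions, not how scipy implements it; the point is that the work scales with the number of stored entries, not with N².

```python
import numpy as np

# COO triplets for a 4-node path graph 0-1-2-3, each edge stored in both directions.
rows = np.array([0, 1, 1, 2, 2, 3])
cols = np.array([1, 0, 2, 1, 3, 2])
vals = np.ones(6)

def coo_matvec(rows, cols, vals, x, n):
    """Compute y = A @ x touching only the stored entries (hypothetical helper)."""
    y = np.zeros(n)
    # np.add.at accumulates correctly when a row index appears more than once
    np.add.at(y, rows, vals * x[cols])
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
y = coo_matvec(rows, cols, vals, x, 4)
```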

Answering your question: you can build a matrix with ones at positions pos = [(1, 4), (3, 2)] like this:

import scipy.sparse
M = scipy.sparse.coo_matrix(([1]*len(pos), tuple(zip(*pos))))  # tuple() gives a real (rows, cols) pair on Python 3

which is in my view pretty much Pythonic :)
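Note that the one-liner stores only the listed (row, col) positions and infers the shape from the largest index. For an undirected graph you would probably want the mirrored entries too, plus an explicit shape. A minimal sketch, assuming a 5×5 matrix (the size `n` is my assumption, not something from the question):

```python
import numpy as np
from scipy import sparse

pos = [(1, 4), (3, 2)]
n = 5  # assumed matrix size; it must exceed the largest index in pos

i, j = np.array(pos).T
rows = np.concatenate([i, j])   # each edge listed as (i, j) ...
cols = np.concatenate([j, i])   # ... and again as (j, i)
data = np.ones(len(rows), dtype=int)
M = sparse.coo_matrix((data, (rows, cols)), shape=(n, n))
dense = M.toarray()
```

Calling `M.tocsr()` afterwards converts to one of the faster formats mentioned above.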
