Fast way to create an incidence matrix from a Python list of labels?

Question

I have an array y with len(y) = M that contains N distinct integer labels, 0 to N - 1. For example, with N = 3:

y = [0, 2, 0, 1, 2, 1, 0, 2]

The incidence matrix A is defined as follows:

Size: M x M
A(i,j) = 1 if y(i) == y(j)
A(i,j) = 0 if y(i) != y(j)

A simple algorithm would be:

import numpy as np

def incidence(y):
    M = len(y)
    A = np.zeros((M, M))
    # Compare every pair (i, j) and mark the entries whose labels match
    for i in range(M):
        for j in range(M):
            if y[i] == y[j]:
                A[i, j] = 1
    return A

But this is very slow. Is there a faster way to do this, for example using a list comprehension or vectorization?

Answer 1

You can take advantage of NumPy broadcasting to gain some efficiency over pure Python by simply asking whether y equals itself reshaped into a column:

import numpy as np

y = np.array([1, 2, 1, 0, 0, 1, 2])

def mat_me(y):
    return (y == y.reshape(-1, 1)).astype(int)

mat_me(y)

which produces:

array([[1, 0, 1, 0, 0, 1, 0],
       [0, 1, 0, 0, 0, 0, 1],
       [1, 0, 1, 0, 0, 1, 0],
       [0, 0, 0, 1, 1, 0, 0],
       [0, 0, 0, 1, 1, 0, 0],
       [1, 0, 1, 0, 0, 1, 0],
       [0, 1, 0, 0, 0, 0, 1]])
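
To see why a single comparison yields an M x M result, it can help to look at the shapes involved. The following is only an illustrative sketch: the names row, col, and incidence_loop are not from the answer, and the loop version is included purely as a reference check against the definition in the question.

```python
import numpy as np

y = np.array([1, 2, 1, 0, 0, 1, 2])

row = y[np.newaxis, :]   # shape (1, M)
col = y[:, np.newaxis]   # shape (M, 1), equivalent to y.reshape(-1, 1)

# Comparing a (1, M) array with an (M, 1) array broadcasts both to (M, M),
# so entry [i, j] holds y[i] == y[j].
A = (row == col).astype(int)

def incidence_loop(labels):
    """Reference double loop (hypothetical helper), following the question's definition."""
    M = len(labels)
    out = np.zeros((M, M), dtype=int)
    for i in range(M):
        for j in range(M):
            if labels[i] == labels[j]:
                out[i, j] = 1
    return out

same = np.array_equal(A, incidence_loop(y))
print(A.shape, same)
```

Because equality is symmetric, it does not matter which operand is the row and which is the column; the result is the same matrix either way.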

For comparison:

y = np.random.choice([1, 2, 3], size=3000)

def mat_me(y):
    return (y == y.reshape([-1, 1])).astype(int)

%timeit mat_me(y)
# 28.6 ms ± 1.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

vs.

y = np.random.choice([1, 2, 3], size=3000)
y = list(y)

def mat_me_py(y):
    return [[int(a == b) for a in y] for b in y]

%timeit mat_me_py(y)
# 4.16 s ± 213 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
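
%timeit is an IPython magic, so the measurements above will not run in a plain Python script. A rough equivalent using the standard timeit module might look like the sketch below. The function names mat_np and mat_py, the seed, and the smaller size of 300 are assumptions chosen to keep the run short; absolute timings will differ by machine, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)             # fixed seed, an arbitrary choice
y_arr = rng.choice([1, 2, 3], size=300)    # smaller than the 3000 used above
y_list = y_arr.tolist()                    # plain Python ints

def mat_np(y):
    # Broadcasting version from the answer
    return (y == y.reshape(-1, 1)).astype(int)

def mat_py(y):
    # Nested list comprehension version from the answer
    return [[int(a == b) for a in y] for b in y]

# Total time in seconds for 5 calls of each version
t_np = timeit.timeit(lambda: mat_np(y_arr), number=5)
t_py = timeit.timeit(lambda: mat_py(y_list), number=5)

# Both versions should produce the same matrix
agree = np.array_equal(mat_np(y_arr), np.array(mat_py(y_list)))
print(f"numpy: {t_np:.4f} s, pure Python: {t_py:.4f} s, same result: {agree}")
```

Note that list(y) on a NumPy array keeps NumPy scalar elements, whereas tolist() converts them to Python ints, so the pure Python timing can depend on which conversion is used.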

The difference will become very pronounced on larger lists.
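
On larger lists memory also starts to matter, since the result has M x M entries. As a hedged sketch (not part of the original answer), the comparison can be left as a boolean array or cast to a one-byte integer type instead of the default integer type, which is 64-bit on many platforms. The variable names below are illustrative.

```python
import numpy as np

M = 3000
rng = np.random.default_rng(0)            # fixed seed, an arbitrary choice
y = rng.choice([1, 2, 3], size=M)

A_bool = (y == y.reshape(-1, 1))          # dtype bool, 1 byte per entry
A_u8 = A_bool.astype(np.uint8)            # 0/1 values, still 1 byte per entry
A_i64 = A_bool.astype(np.int64)           # 8 bytes per entry

# Sizes in bytes: M * M * itemsize
print(A_bool.nbytes, A_u8.nbytes, A_i64.nbytes)
# 9000000 9000000 72000000
```

Whether the smaller dtype is appropriate depends on what the matrix is used for afterwards; arithmetic on uint8 values can overflow, for example.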

Answer 1 (accepted; 2020-12-19 04:47:51)
