
Is there a NumPy function to map the indices of an array into a sparse vector?

I have a data structure like this:

  • my source arrays are sorted arrays like [2,3,4,5,7,8,9,10,11]
  • I know a priori the maximum value across this collection of arrays; in this case it's 17

What I need to do is build a sparse matrix with 17 rows (the maximum mentioned above) and n columns, where n is the number of arrays. Each column vector should hold the index+1 of each source element, placed at the position given by that element's value, and 0 where no element is present. In the mentioned example the output vector should be [0,1,2,3,4,0,5,6,7,8,9,10,11,0,0,0,0]. Is there an efficient way to do that in NumPy without looping through columns and rows, which would have a dramatic computational cost?

from scipy import sparse
import numpy as np

in_list = [2,3,4,5,7,8,9,10,11]
in_list_len = len(in_list)
max_num = 17
a = sparse.csr_matrix((max_num, in_list_len), dtype=int)  # np.int was removed in NumPy 1.24; use the builtin int

for ind, val in enumerate(in_list):
    a[val, ind] = ind + 1

The resulting matrix, shown as a dense array:

Out[23]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 2, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 3, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 4, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 5, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 6, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 7, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 8, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 9],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])

Complexity is O(len(in_list))
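The Python-level loop can also be avoided by handing all values and coordinates to the constructor at once. This is a minimal sketch, assuming the same `in_list` and `max_num` as above and SciPy's `(data, (row, col))` constructor form; the names `rows`, `cols`, and `data` are only illustrative:

```python
import numpy as np
from scipy import sparse

in_list = [2, 3, 4, 5, 7, 8, 9, 10, 11]
max_num = 17

rows = np.asarray(in_list)       # row index = value of each source element
cols = np.arange(len(in_list))   # column index = position in the source list
data = cols + 1                  # stored value = position + 1

# Build the matrix in one call instead of assigning element by element.
a = sparse.csr_matrix((data, (rows, cols)), shape=(max_num, len(in_list)))
dense = a.toarray()
```

This should produce the same matrix as the loop, and it sidesteps the element-wise assignment into a CSR matrix, which SciPy generally flags as inefficient.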

Your desired output makes no sense, because you asked for a matrix but specified a list.
I am pretty sure this is what you wanted.

The closest would be

In [18]: a.data
Out[18]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])

This functionality appears to exist in pandas:

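import numpy as np
import pandas as pd
# Assumes matteos_sorted_arrays is a float NumPy array (NaN needs a float dtype).
matteos_sorted_arrays_with_nans = matteos_sorted_arrays.copy()
matteos_sorted_arrays_with_nans[2:-2] = np.nan
sdf = pd.Series(pd.arrays.SparseArray(matteos_sorted_arrays_with_nans))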

Without further particulars, I haven't the foggiest what to recommend as a next step, though.

Using x to assign consecutive numbers to elements of an array:

In [16]: x
Out[16]: array([ 2,  3,  4,  5,  7,  8,  9, 10, 11])
In [17]: arr = np.zeros(17,int)
In [18]: arr
Out[18]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [19]: arr[x-1] = np.arange(1,len(x)+1)
In [20]: arr
Out[20]: array([0, 1, 2, 3, 4, 0, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0])
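The same indexing trick can probably be stretched to several source arrays at once, giving one column per array. This is a rough sketch, assuming the `x-1` offset used above; the list `arrays` and its second entry are made up for illustration:

```python
import numpy as np

max_num = 17
# Hypothetical input: a list of sorted source arrays (the second one is invented).
arrays = [np.array([2, 3, 4, 5, 7, 8, 9, 10, 11]), np.array([1, 4, 17])]

lengths = np.array([len(x) for x in arrays])
rows = np.concatenate(arrays) - 1                    # same x-1 offset as above
cols = np.repeat(np.arange(len(arrays)), lengths)    # column = which source array
vals = np.concatenate([np.arange(1, n + 1) for n in lengths])

out = np.zeros((max_num, len(arrays)), dtype=int)
out[rows, cols] = vals                               # one fancy-indexed assignment
```

The only Python-level iteration left is over the arrays themselves, not over their elements. If a sparse result is needed, the same `rows`, `cols`, and `vals` could be passed to a `scipy.sparse` constructor instead of filling a dense array.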

