scipy.sparse.hstack(([1], [2])) -> "ValueError: blocks must be 2-D". Why?

scipy.sparse.hstack((1, [2])) and scipy.sparse.hstack((1, 2)) work well, but scipy.sparse.hstack(([1], [2])) does not. Why is this the case?

Here is a trace of what's happening on my system:


C:\Anaconda>python
Python 2.7.10 |Anaconda 2.3.0 (64-bit)| (default, May 28 2015, 16:44:52) [MSC v.1500 64 bit (AMD64)] on win32
>>> import scipy.sparse
>>> scipy.sparse.hstack((1, [2]))
<1x2 sparse matrix of type '<type 'numpy.int32'>'
        with 2 stored elements in COOrdinate format>
>>> scipy.sparse.hstack((1, 2))
<1x2 sparse matrix of type '<type 'numpy.int32'>'
        with 2 stored elements in COOrdinate format>
>>> scipy.sparse.hstack(([1], [2]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\scipy\sparse\construct.py", line 456, in hstack
    return bmat([blocks], format=format, dtype=dtype)
  File "C:\Anaconda\lib\site-packages\scipy\sparse\construct.py", line 539, in bmat
    raise ValueError('blocks must be 2-D')
ValueError: blocks must be 2-D
>>> scipy.version.full_version
'0.16.0'
>>>

In the first case, scipy.sparse.hstack((1, [2])), the number 1 is interpreted as a scalar value and [2] is interpreted as a dense matrix. When the two are combined, the data types are coerced so that both are treated as scalars, and scipy.sparse.hstack can stack them normally.

Here are some more tests to show that this holds with multiple values:

In [31]: scipy.sparse.hstack((1,2,[3],[4]))
Out[31]: 
<1x4 sparse matrix of type '<type 'numpy.int64'>'
    with 4 stored elements in COOrdinate format>

In [32]: scipy.sparse.hstack((1,2,[3],[4],5,6))
Out[32]: 
<1x6 sparse matrix of type '<type 'numpy.int64'>'
    with 6 stored elements in COOrdinate format>

In [33]: scipy.sparse.hstack((1,[2],[3],[4],5,[6],7))
Out[33]: 
<1x7 sparse matrix of type '<type 'numpy.int64'>'

As you can see, if at least one scalar is present in the hstack call, this seems to work.

However, in the second case, scipy.sparse.hstack(([1],[2])), neither argument is a scalar anymore. Both are dense matrices, and you can't use scipy.sparse.hstack with purely dense matrices.

To reproduce:

In [34]: scipy.sparse.hstack(([1],[2]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-cd79952b2e14> in <module>()
----> 1 scipy.sparse.hstack(([1],[2]))

/usr/local/lib/python2.7/site-packages/scipy/sparse/construct.pyc in hstack(blocks, format, dtype)
    451 
    452     """
--> 453     return bmat([blocks], format=format, dtype=dtype)
    454 
    455 

/usr/local/lib/python2.7/site-packages/scipy/sparse/construct.pyc in bmat(blocks, format, dtype)
    531 
    532     if blocks.ndim != 2:
--> 533         raise ValueError('blocks must be 2-D')
    534 
    535     M,N = blocks.shape

ValueError: blocks must be 2-D

See this post for more insight: Scipy error with sparse hstack

Therefore, if you want to use this successfully with two matrices, you must make them sparse first, then combine them:

In [36]: A = scipy.sparse.coo_matrix([1])

In [37]: B = scipy.sparse.coo_matrix([2])

In [38]: C = scipy.sparse.hstack([A, B])

In [39]: C
Out[39]: 
<1x2 sparse matrix of type '<type 'numpy.int64'>'
    with 2 stored elements in COOrdinate format>
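The same idea can be wrapped in a small helper. This is only a sketch: the name sparse_hstack_any is made up for illustration, and it assumes every block is a dense scalar or list (not an already sparse matrix). It makes each block explicitly 2-D with np.atleast_2d before converting it, so hstack only ever sees sparse matrices.

```python
import numpy as np
import scipy.sparse


def sparse_hstack_any(blocks):
    """Hypothetical helper: turn each dense block into a 2-D sparse matrix, then stack."""
    # np.atleast_2d turns a scalar such as 1 into [[1]] and a list such as [2]
    # into [[2]], so every block is 2-D before it is made sparse.
    sparse_blocks = [scipy.sparse.coo_matrix(np.atleast_2d(b)) for b in blocks]
    return scipy.sparse.hstack(sparse_blocks)


C = sparse_hstack_any(([1], [2]))
print(C.shape)      # (1, 2)
print(C.toarray())  # [[1 2]]
```

Because the conversion is explicit, the result does not depend on how hstack happens to treat bare scalars or lists.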

Interestingly enough, if you try the same thing with the dense version of hstack, numpy.hstack, it's perfectly acceptable:

In [48]: import numpy as np

In [49]: np.hstack((1, [2]))
Out[49]: array([1, 2])

... things muck up for sparse matrix representations ¯\_(ツ)_/¯.
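If the inputs are small dense values anyway, one possible alternative (a sketch, not something the question asked for) is to stack them with numpy.hstack first and convert the result to sparse afterwards. np.atleast_2d is used here so the 1-D result becomes a single row explicitly.

```python
import numpy as np
import scipy.sparse

# Stack densely first: np.hstack accepts the plain lists directly.
dense = np.hstack(([1], [2]))

# Make the 1-D result an explicit 1x2 row, then convert it to sparse.
D = scipy.sparse.coo_matrix(np.atleast_2d(dense))
print(D.shape)      # (1, 2)
print(D.toarray())  # [[1 2]]
```

This only makes sense when the dense intermediate fits comfortably in memory.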

The coding details are:

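# Abridged from scipy/sparse/construct.py (see the traceback above)
def hstack(blocks, format=None, dtype=None):
    # wrap the list of blocks in an extra [] to make one row of blocks
    return bmat([blocks], format=format, dtype=dtype)

def bmat(blocks, format=None, dtype=None):
    blocks = np.asarray(blocks, dtype='object')
    if blocks.ndim != 2:
        raise ValueError('blocks must be 2-D')
    # ... (the rest of the function continues from here)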

So trying your alternatives (remembering the extra [] ):

In [392]: np.asarray([(1,2)],dtype=object)
Out[392]: array([[1, 2]], dtype=object)

In [393]: np.asarray([(1,[2])],dtype=object)
Out[393]: array([[1, [2]]], dtype=object)

In [394]: np.asarray([([1],[2])],dtype=object)
Out[394]: 
array([[[1],
        [2]]], dtype=object)

In [395]: _.shape
Out[395]: (1, 2, 1)

This last case (your problem case) failed because the result was 3d, so the blocks.ndim != 2 check raised the error.

With 2 sparse matrices (the expected input; here a is a 1x1 sparse matrix):

In [402]: np.asarray([[a,a]], dtype=object) 
Out[402]: 
array([[ <1x1 sparse matrix of type '<class 'numpy.int32'>'
    with 1 stored elements in COOrdinate format>,
        <1x1 sparse matrix of type '<class 'numpy.int32'>'
    with 1 stored elements in COOrdinate format>]], dtype=object)

In [403]: _.shape
Out[403]: (1, 2)

hstack takes advantage of bmat by turning a list of matrices into a nested (2d) list of matrices. bmat is meant to be a way of combining a 2d array of sparse matrices into one larger one. Skipping the step of first making these sparse matrices may or may not work. The code and the documentation don't make any promises.
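To illustrate the relationship between hstack and bmat described above, here is a minimal sketch that calls scipy.sparse.bmat directly on nested lists of sparse matrices. The variable names (row, col, grid) are only for illustration.

```python
import scipy.sparse

A = scipy.sparse.coo_matrix([[1]])
B = scipy.sparse.coo_matrix([[2]])

# A nested list with one row of blocks gives the same layout as hstack([A, B]).
row = scipy.sparse.bmat([[A, B]])

# One block per row stacks them vertically instead.
col = scipy.sparse.bmat([[A], [B]])

# A full 2d arrangement of blocks builds one larger matrix.
grid = scipy.sparse.bmat([[A, B], [B, A]])

same_as_hstack = (row.toarray() == scipy.sparse.hstack([A, B]).toarray()).all()
print(row.toarray())   # [[1 2]]
print(col.toarray())   # [[1]
                       #  [2]]
print(grid.toarray())  # [[1 2]
                       #  [2 1]]
```

In each call the outer list is the row of blocks and the inner lists hold the blocks themselves, which is why the blocks array has to be exactly 2-D.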
