Avoiding implicit conversion to matrix in numpy operations

Question

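Is there a way to globally prevent a matrix from appearing in any result of numpy computations? For example, if x is a numpy.ndarray and y is a scipy.sparse.csc_matrix, and you write x += y, x will be a matrix afterwards. Is there a way to prevent this, that is, to keep x an ndarray, and more generally to keep using ndarray everywhere a matrix would otherwise be produced?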

Answer 1

I added the scipy tag; this is a scipy.sparse problem, not an np.matrix one.

In [250]: y=sparse.csr_matrix([[0,1],[1,0]])
In [251]: x=np.arange(2)
In [252]: y+x
Out[252]: 
matrix([[0, 2],
        [1, 1]])

sparse + array => matrix

(As a side note, np.matrix is a subclass of np.ndarray. sparse.csr_matrix is not a subclass. It has many numpy-like operations, but it implements them in its own code.)
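The class relationship is easy to check directly. This is only a small sketch; the variable names are made up for illustration.

```python
import numpy as np
from scipy import sparse

# np.matrix derives from np.ndarray; the sparse class does not.
matrix_is_ndarray_subclass = issubclass(np.matrix, np.ndarray)
sparse_is_ndarray_subclass = issubclass(sparse.csr_matrix, np.ndarray)

print(matrix_is_ndarray_subclass)  # True
print(sparse_is_ndarray_subclass)  # False
```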

In [255]: x += y
In [256]: x
Out[256]: 
matrix([[0, 2],
        [1, 1]])

Technically this should not happen; in effect it is doing x = x+y, assigning a new value to x rather than just modifying x.

If I first convert y to a regular dense matrix, I get an error. Allowing the operation would change a 1d array into a 2d one.

In [258]: x += y.todense()
...
ValueError: non-broadcastable output operand with shape (2,) doesn't match the broadcast shape (2,2)
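The same error can be reproduced with plain numpy arrays, which suggests it comes from the in-place broadcasting rule rather than from anything sparse. A minimal sketch, with illustrative variable names:

```python
import numpy as np

x = np.arange(2)                    # shape (2,)
dense = np.array([[0, 1], [1, 0]])  # shape (2, 2)

# Out-of-place addition broadcasts x up to (2, 2) and returns a new array.
summed = x + dense

# In-place addition has to store the result in x, whose shape is fixed at (2,),
# so numpy refuses instead of silently reshaping x.
try:
    x += dense
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(summed.shape)
print(error_message)
```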

Changing x to 2d allows the addition to proceed, without changing the array to a matrix:

In [259]: x=np.eye(2)
In [260]: x
Out[260]: 
array([[ 1.,  0.],
       [ 0.,  1.]])
In [261]: x += y.todense()
In [262]: x
Out[262]: 
array([[ 1.,  1.],
       [ 1.,  1.]])

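In general, performing addition/subtraction with sparse matrices is tricky. They were designed for matrix multiplication. Multiplication doesn't change the sparsity as much as addition does. y+1, for example, would make it dense.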
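A rough way to see the sparsity argument is to count stored or nonzero entries. This is a sketch on the same 2x2 example; the scalar shift is done on a dense copy here, because the sparse `__add__` rejects nonzero scalars.

```python
import numpy as np
from scipy import sparse

y = sparse.csr_matrix([[0, 1], [1, 0]])

scaled = y * 3             # scalar multiplication keeps the sparsity pattern
product = y @ y            # sparse matrix product, result is still sparse
shifted = y.toarray() + 1  # adding 1 touches every element, zeros included

print(y.nnz, scaled.nnz)          # same number of stored entries
print(sparse.issparse(product))
print(np.count_nonzero(shifted))  # every element is now nonzero
```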

Without digging into the details of how sparse addition is coded, I'd say: don't try this x+=... operation without first turning y into a dense version.

In [265]: x += y.A
In [266]: x
Out[266]: 
array([[ 1.,  2.],
       [ 2.,  1.]])

I can't think of a good reason not to do this.
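Wrapped up as a function, the densify-first approach might look like the sketch below. `add_sparse_inplace` is a hypothetical helper name, and `toarray()` is used as the explicit spelling of the dense conversion.

```python
import numpy as np
from scipy import sparse

def add_sparse_inplace(x, y):
    """Add sparse y to ndarray x in place by densifying y first (hypothetical helper)."""
    x += y.toarray()  # toarray() returns a plain ndarray, so x stays an ndarray
    return x

x = np.eye(2)
y = sparse.csr_matrix([[0, 1], [1, 0]])
result = add_sparse_inplace(x, y)

print(type(x).__name__)
print(x)
```

The cost is a temporary dense copy of y, which may matter for very large matrices.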

(I should check the scipy github for a bug issue on this.)

scipy/sparse/compressed.py has the csr addition code. x+y uses x.__add__(y), but sometimes that gets flipped to the reflected call y.__radd__(x). x+=y uses x.__iadd__(y). So I may need to examine __iadd__ for ndarray as well.

But the basic addition for sparse matrices is:

def __add__(self,other):
    # First check if argument is a scalar
    if isscalarlike(other):
        if other == 0:
            return self.copy()
        else:  # Now we would add this scalar to every element.
            raise NotImplementedError('adding a nonzero scalar to a '
                                      'sparse matrix is not supported')
    elif isspmatrix(other):
        if (other.shape != self.shape):
            raise ValueError("inconsistent shapes")

        return self._binopt(other,'_plus_')
    elif isdense(other):
        # Convert this matrix to a dense matrix and add them
        return self.todense() + other
    else:
        return NotImplemented

So y+x becomes y.todense() + x. And x+y ends up using the same thing.
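The equivalence can be checked by comparing values. In this sketch every result is passed through `np.asarray`, which also happens to be a simple way to get a plain ndarray back whatever type the addition returns; the variable names are illustrative.

```python
import numpy as np
from scipy import sparse

y = sparse.csr_matrix([[0, 1], [1, 0]])
x = np.eye(2)

# np.asarray strips the matrix subclass (if any) and returns a base ndarray.
via_sparse_left = np.asarray(y + x)
via_sparse_right = np.asarray(x + y)
via_dense = np.asarray(y.todense() + x)

print(type(via_sparse_left).__name__)
print(np.array_equal(via_sparse_left, via_dense))
print(np.array_equal(via_sparse_right, via_dense))
```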

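Regardless of the += details, it's clear that adding a sparse to a dense (array or np.matrix) involves converting the sparse to dense. There's no code that iterates over the sparse values and adds them selectively to the dense array.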
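Such a selective update can be written by hand. The sketch below is not scipy's own code: `add_sparse_into` is a hypothetical helper that converts y to COO format and uses `np.add.at` to add only the stored entries into x. It assumes x is a 2-D ndarray with the same shape as y and a dtype that can hold the scaled values.

```python
import numpy as np
from scipy import sparse

def add_sparse_into(x, y, c=1.0):
    """In-place x += c * y, touching only the stored entries of y (hypothetical helper)."""
    coo = y.tocoo()  # COO exposes explicit row, col and data arrays
    # np.add.at accumulates correctly even if an index pair is repeated.
    np.add.at(x, (coo.row, coo.col), c * coo.data)
    return x

x = np.eye(2)
y = sparse.csr_matrix([[0, 1], [1, 0]])
add_sparse_into(x, y)

print(type(x).__name__)
print(x)
```

No dense copy of y is created, and x is modified in place, so it stays an ndarray.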

Only when both operands are sparse does it perform a special sparse addition. y+y works, returning a sparse matrix. y+=y fails with a NotImplementedError from sparse.base.__iadd__.
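A quick check of the sparse + sparse case follows. The in-place branch is wrapped in a try block because its outcome appears to depend on the SciPy version: the answer saw a NotImplementedError, while other versions may fall back to an ordinary addition. Treat that as an assumption to verify on your own installation.

```python
from scipy import sparse

y = sparse.csr_matrix([[0, 1], [1, 0]])

total = y + y  # sparse + sparse stays sparse

w = y.copy()
try:
    w += y
    inplace_outcome = "accepted"
except NotImplementedError:
    inplace_outcome = "NotImplementedError"

print(sparse.issparse(total))
print(total.toarray().tolist())
print(inplace_outcome)
```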

Here's the best diagnostic sequence that I've come up with, trying various ways of adding y to a (2,2) array.

In [348]: x=np.eye(2)
In [349]: x+y
Out[349]: 
matrix([[ 1.,  1.],
        [ 1.,  1.]])
In [350]: x+y.todense()
Out[350]: 
matrix([[ 1.,  1.],
        [ 1.,  1.]])

Addition produces a matrix, but the values can be written to x without changing the class (or shape) of x:

In [351]: x[:] = x+y
In [352]: x
Out[352]: 
array([[ 1.,  1.],
       [ 1.,  1.]])
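As a standalone script, the slice-assignment pattern might look like this. It is a sketch with illustrative names, checking that x remains the same ndarray object after the update.

```python
import numpy as np
from scipy import sparse

x = np.eye(2)
y = sparse.csr_matrix([[0, 1], [1, 0]])
original = x

# The right-hand side may be a matrix, but slice assignment only copies
# its values into the existing buffer of x.
x[:] = x + y

print(x is original)
print(type(x).__name__)
print(x)
```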

+= with the dense matrix does the same:

In [353]: x += y.todense()
In [354]: x
Out[354]: 
array([[ 1.,  2.],
       [ 2.,  1.]])

But something in the += sparse changes the class of x:

In [355]: x += y
In [356]: x
Out[356]: 
matrix([[ 1.,  3.],
        [ 3.,  1.]])

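Further testing, looking at id(x) and x.__array_interface__, makes it clear that x += y replaces x. This is true even if x starts as an np.matrix. So the sparse += is not an in-place operation. x += y.todense() is an in-place operation.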
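The identity check can be scripted roughly as follows. `buffer_address` is a hypothetical helper that reads the data pointer from `__array_interface__`. Only the dense case is asserted to be in place; the result of the sparse case is printed rather than assumed, since it may vary between numpy and scipy versions.

```python
import numpy as np
from scipy import sparse

def buffer_address(a):
    """Return the memory address of the array's data buffer (hypothetical helper)."""
    return a.__array_interface__["data"][0]

y = sparse.csr_matrix([[0, 1], [1, 0]])
x = np.eye(2)
original, address = x, buffer_address(x)

x += y.toarray()  # dense right-hand side
dense_inplace = (x is original) and (buffer_address(x) == address)

x += y            # sparse right-hand side, may rebind x to a new object
sparse_inplace = x is original

print(dense_inplace)
print(sparse_inplace, type(x).__name__)
```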

Answer 2

Yes, this is a bug; but https://github.com/scipy/scipy/issues/7826 says

I do not really see a way to change this.

An X += c * Y without todense follows.
Some inc( various array / matrix, various sparse ) combinations have been tested, but surely not all.

def inc( X, Y, c=1. ):
    """ X += c * Y, X Y sparse or dense """
    if (not hasattr( X, "indices" )  # dense += sparse
    and hasattr( Y, "indices" )):
        # inc an ndarray view, because ndarray += sparse -> matrix --
        # note: for CSR, Y.indices holds column indices, so this indexing
        # fits a single-row Y and a matching 1-d view of X
        X = getattr( X, "A", X ).squeeze()
        X[Y.indices] += c * Y.data
    else:
        X += c * Y  # sparse + different sparse: SparseEfficiencyWarning
    return X

Solution 1
2 2016-02-22 07:45:16

Solution 2
0 2017-09-13 10:05:39
