How can I efficiently permute the rows of a sparse (NumPy) matrix using a permutation array?

Question

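I am using the SciPy reverse Cuthill-McKee implementation (scipy.sparse.csgraph.reverse_cuthill_mckee) to create a band matrix from a (high-dimensional) sparse csr_matrix. The result of this method is a permutation array which, as far as I understand, gives me the indices for permuting the rows of my matrix.

Is there an efficient way to apply this permutation to my sparse csr_matrix and store the result in any other sparse matrix (csr, lil_matrix, etc.)? I tried a for loop, but my matrix is about 200,000 x 150,000 and the loop takes far too long.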

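import numpy as np
from scipy.sparse import csr_matrix, lil_matrix, csgraph

# data, rowind, columnind: the (value, row, column) triplets of the matrix (not shown)
A = csr_matrix((data,(rowind,columnind)), shape=(200000, 150000), dtype=np.uint8)

permutation_array = csgraph.reverse_cuthill_mckee(A, False)

result_matrix = lil_matrix((200000, 150000), dtype=np.uint8)

# slow part: copies one row per iteration
for i, x in enumerate(permutation_array):
    result_matrix[x, :] = A[i, :]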

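The result of the reverse_cuthill_mckee call is an array that acts like a tuple holding the indices of my permutation. It looks something like this: [199999 54877 54873 ..., 12045 9191 0] (size = 200,000)

This means: the row with index 0 now has index 199999, the row with index 1 now has index 54877, the row with index 2 now has index 54873, and so on. See: https://en.wikipedia.org/wiki/Permutation#Definition_and_notations (this is how I understand the return value)

Thanks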

Answer 1

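I wonder whether you are applying the permutation array correctly.

Make a random matrix (float) and convert it to uint8 (beware: csr calculations might not work with this dtype):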

In [963]: ran=sparse.random(10,10,.3, format='csr')
In [964]: A = sparse.csr_matrix((np.ones(ran.data.shape).astype(np.uint8),ran.indices, ran.indptr))
In [965]: A.A
Out[965]: 
array([[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
       [0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
       [0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]], dtype=uint8)

(Oops, I used the wrong matrix here):

In [994]: permutation_array = csgraph.reverse_cuthill_mckee(A, False)
In [995]: permutation_array
Out[995]: array([9, 7, 0, 4, 6, 3, 5, 1, 8, 2], dtype=int32)

My first inclination is to use such an array to simply index the rows of the original matrix:

In [996]: A[permutation_array,:].A
Out[996]: 
array([[0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
       [1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
       [1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
       [0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
       [0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)

I see some clustering; maybe that is the best we can expect from a random matrix.
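For the efficiency part of the question, here is a minimal sketch of the same row indexing on a larger matrix, with no Python loop. The sizes, density and seed are made up for illustration, and `.toarray()` is used instead of the `.A` shorthand because newer SciPy releases may not provide `.A`.

```python
import numpy as np
from scipy import sparse

# Illustrative sizes only; the real matrix in the question is far larger.
A = sparse.random(1000, 800, density=0.01, format='csr', random_state=0)

rng = np.random.default_rng(0)
perm = rng.permutation(A.shape[0])   # stand-in for a real permutation array

# Fancy indexing on the row axis: row k of the result is row perm[k] of A.
# The result is still a sparse matrix, so nothing is densified here.
permuted = A[perm, :]
```

Indexing a csr_matrix by rows is the cheap direction for that format; the row-by-row assignment into a lil_matrix in the question pays Python overhead on every iteration.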

You, on the other hand, appear to be doing:

In [997]: res = sparse.lil_matrix(A.shape,dtype=A.dtype)
In [998]: res[permutation_array,:] = A
In [999]: res.A
Out[999]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
       [0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
       [1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
       [0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]], dtype=uint8)

I don't see any improvement in the clustering of the 1s in res.
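If the "row i moves to position perm[i]" reading really is what you want, the scatter can still be done without a loop: assigning into `res[perm, :]` is the same as indexing with the inverse permutation. A small sketch, where `inverse_permutation` is a helper name I made up for illustration:

```python
import numpy as np
from scipy import sparse

def inverse_permutation(perm):
    """Return inv such that inv[perm[i]] == i for every i."""
    inv = np.empty_like(perm)
    inv[perm] = np.arange(len(perm), dtype=perm.dtype)
    return inv

dense = np.arange(20).reshape(5, 4) % 3      # small test matrix
A = sparse.csr_matrix(dense)
perm = np.array([4, 2, 0, 3, 1])

# Scatter on a dense reference copy: row i of the input lands at row perm[i].
scattered = np.empty_like(dense)
scattered[perm] = dense

# Equivalent gather on the sparse matrix: row k of the result is row inv[k] of A.
gathered = A[inverse_permutation(perm), :]
```

`np.argsort(perm)` gives the same inverse, at the cost of a sort.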

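The MATLAB documentation for the equivalent function says

r = symrcm(S) returns the symmetric reverse Cuthill-McKee ordering of S. This is a permutation r such that S(r,r) tends to have its nonzero elements closer to the diagonal.

In numpy terms, this means: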

In [1019]: I,J=np.ix_(permutation_array,permutation_array)
In [1020]: A[I,J].A
Out[1020]: 
array([[0, 0, 0, 1, 1, 0, 1, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0, 0, 1, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
       [0, 1, 1, 0, 0, 0, 1, 0, 0, 1],
       [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 1, 1, 1, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)

And indeed there are more bands of 0s in the two off-diagonal corners.
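The `S(r,r)` form applies the same permutation to rows and columns. As a sketch on a tiny matrix of my own (not the one above), chained indexing and an explicit permutation matrix `P` give the same result, with `P A P^T` being the matrix-algebra way to write it:

```python
import numpy as np
from scipy import sparse

dense = np.array([[1, 0, 0, 1],
                  [0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [1, 0, 0, 1]], dtype=np.int32)
A = sparse.csr_matrix(dense)
perm = np.array([0, 3, 1, 2])

# Select rows first, then columns, with the same permutation.
chained = A[perm, :][:, perm]

# Explicit permutation matrix: P[k, perm[k]] = 1, so (P @ A)[k] == A[perm[k]].
n = len(perm)
P = sparse.csr_matrix((np.ones(n, dtype=dense.dtype), (np.arange(n), perm)),
                      shape=(n, n))
sandwiched = P @ A @ P.T

# Both equal dense[np.ix_(perm, perm)], which here is block diagonal:
# [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
```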

And using the bandwidth calculation from the MATLAB page, https://www.mathworks.com/help/matlab/ref/symrcm.html:

In [1028]: i,j=A.nonzero()
In [1029]: np.max(i-j)
Out[1029]: 7
In [1030]: i,j=A[I,J].nonzero()
In [1031]: np.max(i-j)
Out[1031]: 5
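Note that `np.max(i-j)` only looks below the diagonal. A sketch of a helper that reports both sides follows; the name `bandwidths` and the two test matrices are mine, for illustration only:

```python
import numpy as np
from scipy import sparse

def bandwidths(M):
    """Return (lower, upper): the largest i-j and the largest j-i over nonzeros."""
    i, j = sparse.coo_matrix(M).nonzero()
    if i.size == 0:
        return 0, 0
    i = i.astype(np.int64)
    j = j.astype(np.int64)
    return int(max((i - j).max(), 0)), int(max((j - i).max(), 0))

# Tridiagonal pattern: one band on each side of the diagonal.
T = sparse.csr_matrix(np.array([[1, 1, 0, 0],
                                [1, 1, 1, 0],
                                [0, 1, 1, 1],
                                [0, 0, 1, 1]]))

# Diagonal plus one entry in the upper-right corner.
U = sparse.csr_matrix(np.array([[1, 0, 0, 1],
                                [0, 1, 0, 0],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]]))

print(bandwidths(T))   # (1, 1)
print(bandwidths(U))   # (0, 3); max(i-j) alone would report 0 here
```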

The MATLAB documentation says that the eigenvalues are unchanged by this permutation. Testing:

In [1032]: from scipy.sparse import linalg
In [1048]: linalg.eigs(A.astype('f'))[0]
Out[1048]: 
array([ 3.14518213+0.j        , -0.96188843+0.j        ,
       -0.58978939+0.62853903j, -0.58978939-0.62853903j,
        1.09950364+0.54544497j,  1.09950364-0.54544497j], dtype=complex64)
In [1049]: linalg.eigs(A[I,J].astype('f'))[0]
Out[1049]: 
array([ 3.14518023+0.j        ,  1.09950352+0.54544479j,
        1.09950352-0.54544479j, -0.58978981+0.62853914j,
       -0.58978981-0.62853914j, -0.96188819+0.j        ], dtype=complex64)

The eigenvalues are different for the row-only permutations we tried earlier:

In [1050]: linalg.eigs(A[permutation_array,:].astype('f'))[0]
Out[1050]: 
array([ 2.95226836+0.j        , -1.60117996+0.52467293j,
       -1.60117996-0.52467293j, -0.01723826+1.06249797j,
       -0.01723826-1.06249797j,  0.90314150+0.j        ], dtype=complex64)
In [1051]: linalg.eigs(res.astype('f'))[0]
Out[1051]: 
array([-0.05822830-0.97881651j, -0.99999994+0.j        ,
        1.17350495+0.j        , -0.91237622+0.8656373j ,
       -0.91237622-0.8656373j ,  2.26292515+0.j        ], dtype=complex64)
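The reason is that permuting rows and columns together is a similarity transform, `P A P^T` with `P^T` equal to the inverse of `P`, whereas permuting rows alone is just `P A`. A dense sketch on a small matrix I picked so the eigenvalues are easy to read off (it is upper triangular, so they are the diagonal entries):

```python
import numpy as np

M = np.array([[1., 2., 0., 1.],
              [0., 2., 1., 0.],
              [0., 0., 4., 3.],
              [0., 0., 0., 7.]])
perm = np.array([2, 0, 3, 1])

both = M[np.ix_(perm, perm)]   # rows and columns permuted: P M P^T
rows_only = M[perm, :]         # rows permuted only: P M

def sorted_eigs(X):
    """Eigenvalues sorted by real part, then imaginary part."""
    return np.sort_complex(np.linalg.eigvals(X))

eigs_M = sorted_eigs(M)
eigs_both = sorted_eigs(both)            # matches eigs_M up to rounding
eigs_rows_only = sorted_eigs(rows_only)  # a different spectrum
```

The row-only case can be spotted without computing any eigenvalues: the trace (the sum of the eigenvalues) drops from 14 to 2.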

This [I,J] permutation works with the example matrix from http://ciprian-zavoianu.blogspot.com/2009/01/project-bandwidth-reduction.html:

In [1058]: B = np.matrix('1 0 0 0 1 0 0 0;0 1 1 0 0 1 0 1;0 1 1 0 1 0 0 0;0 0 0 1 0 0 1 0;1 0 1 0 1 0 0 0; 0 1 0 0 0 1 0 1;0 0 0 1 0 0 1 0;0 1 0 0 0 1 0 1')
In [1059]: B
Out[1059]: 
matrix([[1, 0, 0, 0, 1, 0, 0, 0],
        [0, 1, 1, 0, 0, 1, 0, 1],
        [0, 1, 1, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 1, 0],
        [1, 0, 1, 0, 1, 0, 0, 0],
        [0, 1, 0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 0, 1]])
In [1060]: Bm=sparse.csr_matrix(B)
In [1061]: Bm
Out[1061]: 
<8x8 sparse matrix of type '<class 'numpy.int32'>'
    with 22 stored elements in Compressed Sparse Row format>
In [1062]: permB = csgraph.reverse_cuthill_mckee(Bm, False)
In [1063]: permB
Out[1063]: array([6, 3, 7, 5, 1, 2, 4, 0], dtype=int32)
In [1064]: Bm[np.ix_(permB,permB)].A
Out[1064]: 
array([[1, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0],
       [0, 0, 1, 1, 1, 1, 0, 0],
       [0, 0, 0, 0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 1, 1]], dtype=int32)
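This example matrix happens to be symmetric, so, as a sketch, the `symmetric_mode` flag can be set from a symmetry check rather than hard-coded to False. `full_bandwidth` is a helper name of mine, and the check is done on the dense copy only because the matrix is tiny:

```python
import numpy as np
from scipy import sparse
from scipy.sparse import csgraph

B = np.array([[1, 0, 0, 0, 1, 0, 0, 0],
              [0, 1, 1, 0, 0, 1, 0, 1],
              [0, 1, 1, 0, 1, 0, 0, 0],
              [0, 0, 0, 1, 0, 0, 1, 0],
              [1, 0, 1, 0, 1, 0, 0, 0],
              [0, 1, 0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 1, 0, 1]], dtype=np.int32)
Bm = sparse.csr_matrix(B)

def full_bandwidth(M):
    """Largest |i - j| over the nonzero entries of a sparse matrix."""
    i, j = M.nonzero()
    return int(np.abs(i.astype(np.int64) - j.astype(np.int64)).max())

# Dense symmetry check; fine for 8x8, not something to do at 200,000 rows.
is_symmetric = np.array_equal(B, B.T)

perm = csgraph.reverse_cuthill_mckee(Bm, symmetric_mode=is_symmetric)
reordered = Bm[perm, :][:, perm]

before = full_bandwidth(Bm)
after = full_bandwidth(reordered)
print(before, after)
```

With the permutation shown in the session above, [6, 3, 7, 5, 1, 2, 4, 0], the largest |i - j| goes from 6 to 2.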

Solution 1
1 2017-08-17 15:38:04
