
Building a sparse matrix from another smaller matrix's values mapped by a logical array mask (MATLAB)?

I am working with large sparse matrices (sparse). I have a large sparse matrix of values that needs to be embedded in a larger sparse matrix. I have an array of logicals that indicates which rows and columns are to be filled with the smaller matrix's values. In this application the smaller of the two matrices is a graph stored as an adjacency matrix, and the logicals indicate the node id positions in the larger matrix.

A small toy example to demonstrate what I am doing currently:

zz = sparse(4,4);  %create a sparse matrix of the final size desired
rr = rand(3,3);   %a matrix of values
logical_inds = logical([1 0 1 1]); %a logical array to index with the dimension size of zz
zz(logical_inds,logical_inds) = rr(:,:) %'rr' values are mapped into the subset of 'zz'

I see that the 2nd column of zz is all zeros, and that the 2nd row is all zeros as well. This is the desired output.

In my program I get a warning that this "sparse indexing is likely to be slow", and it is. Occasionally, when the matrices are very large, the program terminates at this line.

How can I create this matrix (zz) with the sparse function? I am unsure how to create the row and column indices from the mask of logicals I have, and how to turn the values of rr into an array ordered appropriately for this new indexing.

**In general rr is very sparse, although the mask of logicals addresses the full matrix.

To create this matrix with the sparse function, the logical indices need to be converted into row and column indices, so this may end up being slower...

Here the locations of the ones in the logical vector are found, and then a matrix is created containing the row and column indices of the nonzeros in the sparse matrix.
Finally, the sparse function is used to create the sparse matrix with the elements of rr in these locations (rr(:) is used to convert it into a column vector).

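ind_locs = find(logical_inds);       % positions of the true entries in the mask
ind = combvec(ind_locs,ind_locs);    % all (row; column) pairs; combvec is a toolbox function (Deep Learning Toolbox), not base MATLAB
n = numel(logical_inds);             % final size, so trailing false entries in the mask are not dropped

zz = sparse(ind(1,:),ind(2,:),rr(:),n,n)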
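
Since rr is described as very sparse, an alternative sketch is to take only the nonzeros of rr with find and map their local row/column positions through the positions of the true entries in the mask. This avoids enumerating every pair of indices and does not need combvec. The toy data, the explicit size arguments and the reference check are my own assumptions for illustration, not part of the answer above.

```matlab
% Toy data in the spirit of the question's example
logical_inds = logical([1 0 1 1]);   % mask over the rows/columns of the big matrix
rr = sparse([0 2 0; 5 0 0; 0 0 7]);  % small matrix, one row/column per true entry in the mask

n = numel(logical_inds);             % size of the final square matrix
ind_locs = find(logical_inds(:));    % global positions of the masked rows/columns (column vector)

[r, c, v] = find(rr);                % local row, column and value of each nonzero in rr
zz = sparse(ind_locs(r), ind_locs(c), v, n, n);   % map local indices to global ones

% Reference result using the original indexed assignment
zz_ref = sparse(n, n);
zz_ref(logical_inds, logical_inds) = rr;
```

Only nnz(rr) triplets are passed to sparse here, so the work scales with the number of nonzeros in rr rather than with the square of the number of selected rows.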

I think the problem is mostly due to implicit resizing during the assignment. Here's why I think that:

%# test parameters
N  = 5000;               %# Size of 1 dimension of the square sparse
L  = rand(1,N) > 0.95;   %# 5% of rows/cols will be non-zero values
M  = sum(L);            
rr = rand(M);            %# the "data" to fill the sparse with 


%# Method 1: direct logical indexing
%# (your original method)
zz1 = sparse(N,N);    
tic    
    zz1(L,L) = rr;
toc

%# Method 2: test whether the conversion to logical col/row indices matters 
zz2  = sparse(N,N);    
inds = zz1~=0;    
tic        
    zz2(inds) = rr;
toc

%# Method 3: test whether the conversion to linear indices matters 
zz3 = sparse(N,N);
inds = find(inds);
tic        
    zz3(inds) = rr;
toc

%# Method 4: test whether implicit resizing matters    
zz4 = spalloc(N,N, M*M);
tic        
    zz4(inds) = rr;
toc

Results:

Elapsed time is 3.988558 seconds. %# meh   M1 (original)
Elapsed time is 3.916462 seconds. %# meh   M2 (expanded logicals)
Elapsed time is 4.003222 seconds. %# meh   M3 (converted row/col indices)
Elapsed time is 0.139986 seconds. %# WOW!  M4 (pre-allocated memory)

So apparently (and amazingly), it would seem that MATLAB does not grow the existing sparse matrix once before the assignment (as you'd expect), but actually loops through the row/col indices and grows the sparse matrix during the iteration. Therefore, it would seem one has to "help" MATLAB a bit:

%# Create initial sparse
zz1 = sparse(N,N);    

%# ...
%# Do any further operations until you can create rr: 
%# ...

rr = rand(M);            %# the "data" to fill the sparse with 


%# Now that the size of the data is known, re-allocate space for the sparse:
tic
    [i,j] = find(zz1); %# indices
    [m,n] = size(zz1); %# Sparse size (you can also use N of course)
    zz1 = sparse(i,j,nonzeros(zz1), m,n, M*M);
    zz1(L,L) = rr; %# logical or integer indices, doesn't really matter 
toc

Results (for the same N, L and rr):

Elapsed time is 0.034950 seconds.  %# order of magnitude faster than even M4!
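
One way to look at the preallocation effect is nzmax, which reports how many nonzero entries a sparse matrix currently has storage for. The sketch below uses small sizes and a deterministic mask chosen only for illustration (they are assumptions, not the benchmark values above). It compares an empty sparse(N,N) with one created by spalloc. It does not show how MATLAB grows storage internally; it only shows that the two matrices start with different capacities and end up with the same contents.

```matlab
N = 200;                       % small illustrative size (not the N used in the timings)
L = false(1, N);
L(1:10:end) = true;            % deterministic mask: 20 selected rows/columns
M = sum(L);
rr = rand(M) + 1;              % strictly nonzero data to fill the block with

zz_empty = sparse(N, N);       % no storage reserved for the data
zz_pre   = spalloc(N, N, M*M); % storage reserved for M*M nonzeros

cap_empty = nzmax(zz_empty);   % capacity before the assignment
cap_pre   = nzmax(zz_pre);

zz_empty(L, L) = rr;           % both assignments produce the same matrix;
zz_pre(L, L)   = rr;           % only the amount of reallocation can differ
```

Wrapping each assignment in tic/toc at a larger N would reproduce the kind of comparison shown in the timings above.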

See also this question+answers.
