
armadillo: Get nonzero locations of sparse row vector from sparse matrix

I am using Armadillo from R through RcppArmadillo.

I have a sparse matrix and a row number as input. I would like to search the corresponding row of the matrix and return the column indices of its nonzero entries.

So far my function looks like

// [[Rcpp::export]]
arma::uvec findAdjacentStates(arma::sp_mat adjacency, int row) {
  arma::uvec out(adjacency.n_cols);
  arma::SpSubview<double>::const_iterator start = adjacency.row(row).begin();
  arma::SpSubview<double>::const_iterator end = adjacency.row(row).end();

  for(arma::SpSubview<double>::const_iterator it = start; it != end; ++it)
  {
    Rcout << "location: " << it.row() << "," << it.col() << "  ";
    Rcout << "value: " << (*it) << std::endl;
  }
  return out;
}

which is based on a previous SO answer.

The function crashes R.

require(Matrix)
x = rsparsematrix(10, 10, .2)
x = x > 1
x = as(x, "dgCMatrix")
findAdjacentStates(x, 1)

One thing that is not clear to me is how to iterate over a row vector specifically; the previous SO answer was about iterating over a whole sparse matrix.

Update: according to valgrind, the problem is in operator++ (SpSubview_iterators_meat.hpp:319), so it seems this is not the correct way to iterate over a sparse row vector.

The way to iterate over a row of a sparse matrix is with an sp_mat::row_iterator (or an sp_mat::const_row_iterator for read-only access), obtained from begin_row() and end_row(). Unfortunately, the number of nonzeros in the row is not known ahead of time, so neither is the size of the output vector, and uvec objects don't have a push_back like std::vector does. Here is my suggestion:

#include <RcppArmadillo.h>

// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp;
using namespace arma;

// [[Rcpp::export]]
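IntegerVector findAdjacentStates(sp_mat adjacency, int row) {
    // `row` is zero-based (C++ indexing), so R's row 2 is row 1 here.
    IntegerVector out;
    // Row iterators visit only the stored (nonzero) entries of the given row.
    sp_mat::const_row_iterator start = adjacency.begin_row(row);
    sp_mat::const_row_iterator end = adjacency.end_row(row);
    for ( sp_mat::const_row_iterator i = start; i != end; ++i )
    {
        // Record the zero-based column index of each nonzero entry.
        out.push_back(i.col());
    }
    return out;
}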

We can test this easily enough:

# We need Rcpp and Matrix:
library(Rcpp)
library(Matrix)
# This is the C++ script I wrote:
sourceCpp('SOans.cpp')
# Make example data (setting seed for reproducibility):
set.seed(123)
x = rsparsematrix(10, 10, .2)
# And test the function:
findAdjacentStates(x, 1)

[1] 4

x

10 x 10 sparse Matrix of class "dgCMatrix"

 [1,]  .    .  0.84 .  0.40  0.7 .  .     .    -0.56
 [2,]  .    .  .    . -0.47  .   .  .     .     .   
 [3,]  .    .  .    .  .     .   .  .    -2.00  .   
 [4,]  0.15 .  .    .  .     .   .  .     .    -0.73
 [5,]  1.80 .  .    .  .     .   .  .     .     .   
 [6,]  .    .  .    .  .     .   .  .     0.11  .   
 [7,]  .    . -1.10 .  .     .   . -1.70 -1.10  .   
 [8,]  .    .  .    .  .     .   .  1.30  .    -0.22
 [9,] -0.63 .  1.20 .  .     .   .  0.36  .     .   
[10,]  .    .  .    .  0.50 -1.0 .  .     .     .   

So we can see this works: row 1 in C++ terms (row 2 in R terms) has only one nonzero element, which is in column 4 in C++ terms (column 5 in R terms). This should work if you want to return the output to R. If you want to use the output in another C++ function, you may prefer a uvec rather than an IntegerVector, depending on what you're doing; you can simply convert the IntegerVector to a uvec (probably not the most efficient solution, but the best I could think of right now):

// [[Rcpp::export]]
uvec findAdjacentStates(sp_mat adjacency, int row) {
    // Collect the zero-based column indices in a growable Rcpp vector first.
    IntegerVector tmp;
    sp_mat::const_row_iterator start = adjacency.begin_row(row);
    sp_mat::const_row_iterator end = adjacency.end_row(row);
    for ( sp_mat::const_row_iterator i = start; i != end; ++i )
    {
        tmp.push_back(i.col());
    }
    // Convert to an Armadillo uvec once the final length is known.
    uvec out = as<uvec>(tmp);
    return out;
}
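
For intuition about why the number of nonzeros in a row is not known up front: Armadillo stores sp_mat in compressed sparse column (CSC) format, so the entries of a single row are scattered across the columns. The following is only a standard-library sketch of that idea, not Armadillo's API; the names CscMatrix, nonzero_cols_in_row and make_example are hypothetical, and it assumes the row indices within each column are sorted. It collects the column indices with std::vector::push_back, the growable container that uvec lacks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSC container (illustrative only, not Armadillo's API).
struct CscMatrix {
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<std::size_t> col_ptr;  // size n_cols + 1; column j owns entries [col_ptr[j], col_ptr[j + 1])
    std::vector<std::size_t> row_idx;  // row index of each stored entry (assumed sorted within a column)
    std::vector<double> values;        // stored nonzero values
};

// Return the zero-based column indices of the stored entries in `row`.
std::vector<std::size_t> nonzero_cols_in_row(const CscMatrix& m, std::size_t row) {
    assert(row < m.n_rows);
    std::vector<std::size_t> cols;  // final length is unknown until every column is scanned
    for (std::size_t j = 0; j < m.n_cols; ++j) {
        for (std::size_t k = m.col_ptr[j]; k < m.col_ptr[j + 1]; ++k) {
            if (m.row_idx[k] == row) {
                cols.push_back(j);
                break;
            }
            if (m.row_idx[k] > row) {
                break;  // sorted row indices: this column has no entry in `row`
            }
        }
    }
    return cols;
}

// Example 3 x 4 matrix:
//   [ 1 0 2 0 ]
//   [ 0 0 3 0 ]
//   [ 4 0 0 5 ]
CscMatrix make_example() {
    return CscMatrix{3, 4,
                     {0, 2, 2, 4, 5},
                     {0, 2, 0, 1, 2},
                     {1.0, 4.0, 2.0, 3.0, 5.0}};
}
```

In the Armadillo version, a std::vector<arma::uword> filled the same way could likely replace the temporary IntegerVector, with arma::conv_to<arma::uvec>::from(...) doing the final conversion; I have not benchmarked that against the as<uvec> approach above.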
