
R: how to convert a sparse binary matrix represented by row-indexed lists to column-indexed lists

Suppose I have a sparse m by n binary matrix, and I already use a row-indexed list to represent the positions of the ones. For example, the following 3 by 3 matrix

     [,1] [,2] [,3]
[1,]    1    1    0
[2,]    0    1    0
[3,]    0    0    1

is represented by a list M_row:

> M_row
[[1]]
[1] 1 2

[[2]]
[1] 2

[[3]]
[1] 3

Here the i-th element in the list corresponds to the positions of ones in the i-th row. I want to convert this list to a column-indexed list, where the j-th element in the new list corresponds to the (row) positions of ones in the j-th column. For the previous example, I want:

> M_col
[[1]]
[1] 1 

[[2]]
[1] 1 2

[[3]]
[1] 3

Is there an efficient way to do this without writing many loops?

Try this (it builds the dense matrix first, so it only suits small inputs):

M_row <- list(1:2 , 2, 3) # this is the beginning list

#----------------------------------
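n_col <- max(unlist(M_row))   # number of columns; set it by hand if trailing columns are all zero
m <- matrix(0 , length(M_row) , n_col)   # rows x columns, need not be square

for(i in seq_len(nrow(m))) {
  m[ i , M_row[[i]]] <- 1
}
# lapply() always returns a list; apply() would simplify the result to a
# vector or matrix whenever every column holds the same number of ones
M_col <- lapply(seq_len(ncol(m)) , \(j) which(m[ , j] == 1))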

#----------------------------------
M_col   # this is the required list
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 1 2
#>
#> [[3]]
#> [1] 3

Here is an algorithm that doesn't create the matrix.

  1. Get the number of columns with sapply/max and create a results list M_col of the required length;
  2. for each input list member, update M_col by appending the row number to it.
M_row <- list(1:2 , 2, 3)

Max_col <- max(sapply(M_row, max))
M_col <- vector("list", length = Max_col)
for(i in seq_along(M_row)) {
  for(j in M_row[[i]]) {
    M_col[[j]] <- c(M_col[[j]], i)
  }
}
M_col
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 1 2
#> 
#> [[3]]
#> [1] 3

Created on 2022-06-19 by the reprex package (v2.0.1)
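
The same idea can be written without an explicit loop. The following is only a sketch in base R, not part of the answer above: the function name transpose_index and its n_col argument are made up for illustration. It repeats each row number once per one in that row, then groups those row numbers by column index with split(). Passing the column indices as a factor with explicit levels keeps columns that contain no ones as empty entries.

```r
# Hypothetical helper: turn a row-indexed list into a column-indexed list.
# n_col defaults to the largest column index seen; pass it explicitly when
# the matrix has trailing all-zero columns or when M_row is entirely empty.
transpose_index <- function(M_row, n_col = max(unlist(M_row))) {
  # row number of every one, in the same order as unlist(M_row)
  rows <- rep(seq_along(M_row), lengths(M_row))
  # column number of every one; explicit levels keep empty columns
  cols <- factor(unlist(M_row), levels = seq_len(n_col))
  # group row numbers by column and drop the "1", "2", ... names
  unname(split(rows, cols))
}

M_row <- list(1:2, 2, 3)
M_col <- transpose_index(M_row)
M_col

# A 3 by 4 matrix whose last column is all zero:
M_col4 <- transpose_index(M_row, n_col = 4)
length(M_col4[[4]])   # the empty column is kept as a zero-length vector
```

Applying the function to its own output should give back the row-indexed list (as integers), which is a cheap sanity check on real data.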

You could use stack + unstack :

M_row <- list(1:2 , 2, 3) # this is the beginning list
d <- type.convert(stack(setNames(M_row, seq_along(M_row))), as.is = TRUE)
d
  values ind
1      1   1
2      2   1
3      2   2
4      3   3

d holds the (row, column) pairs of the ones: values is the column index, while ind is the row index.

columnwise:

unstack(d, ind~values)
$`1`
[1] 1

$`2`
[1] 1 2

$`3`
[1] 3

Rowwise:

unstack(d, values~ind)
$`1`
[1] 1 2

$`2`
[1] 2

$`3`
[1] 3
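
One caveat with unstack(), as far as I can tell from its documented behaviour: when every group has the same length it returns a data frame instead of a plain list, and a column with no ones gets no entry at all. The sketch below uses an identity matrix as an assumed test input to show the first case and converts the result back to an unnamed list. The variable names are only for illustration.

```r
# Identity matrix: every column holds exactly one 1, so all groups have length 1
M_row <- list(1, 2, 3)
d <- type.convert(stack(setNames(M_row, seq_along(M_row))), as.is = TRUE)

res <- unstack(d, ind ~ values)
is.data.frame(res)   # equal-length groups are combined into a data frame

# Convert back to a plain, unnamed list of row positions
M_col <- unname(as.list(res))
str(M_col)
```

If empty columns must be preserved, grouping with split() on a factor that has explicit levels is probably the safer route.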
