
Subsetting columns of an R matrix contingent on a specific entry

It seems as though this question should have been asked many times already, but I searched the "Questions that may already have your answer" suggestions without success.

How do you subset the columns of a matrix with Boolean operators (without using subset())?

> m = matrix(c("A", "B", "B", "B", "C", "A", "C", "C", "D"), nrow = 3)
> m
     [,1] [,2] [,3]
[1,] "A"  "B"  "C" 
[2,] "B"  "C"  "C" 
[3,] "B"  "A"  "D" 

Notice that the columns have no names. I want every column that contains the value "D" in at least one entry.

For instance, this post uses the call grades[grades[,"pass"] == 2,]. That call extracts rows rather than columns, and it refers to a single column by the name pass, whereas my matrix has no column names.

I have tried:

> m[m == "D", ]
Error in m[m == "D", ] : (subscript) logical subscript too long

> m[which(m=="D"), ]
Error in m[which(m == "D"), ] : subscript out of bounds

> m = as.data.frame(m) # Turning the matrix into a df
> m[m == "D", ]
     V1   V2   V3
NA <NA> <NA> <NA>

You can use an apply call to search the column elements and index on those.

m[,apply(m, MARGIN = 2, function(x) any(x == "D")), drop = FALSE]

     [,1]
[1,] "C" 
[2,] "C" 
[3,] "D" 

Note the drop = FALSE argument. It ensures the output is still a matrix when only one column is selected.
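To see what drop = FALSE changes, here is a small sketch comparing the two forms on the matrix from the question. The variable names keep, dropped and kept are only for illustration.

```r
# Same matrix as in the question
m <- matrix(c("A", "B", "B", "B", "C", "A", "C", "C", "D"), nrow = 3)

# Logical vector with one entry per column: TRUE where the column holds a "D"
keep <- apply(m, MARGIN = 2, function(x) any(x == "D"))

# Without drop = FALSE a single selected column collapses to a plain vector
dropped <- m[, keep]
is.matrix(dropped)

# With drop = FALSE the result keeps its two dimensions
kept <- m[, keep, drop = FALSE]
dim(kept)
```

If later code calls ncol() or indexes with two subscripts, the collapsed vector will break it, which is why keeping the matrix shape is usually the safer choice.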

Here is an alternative.

m[, colSums(m == "D") > 0, drop=FALSE]
     [,1]
[1,] "C" 
[2,] "C" 
[3,] "D" 

m == "D" constructs a logical matrix, and colSums counts the TRUEs in each column. Comparing those counts with 0 gives a logical vector that marks the columns containing at least one "D", and that vector is used to subset the matrix. Following @cdeterman's answer, I added drop=FALSE to preserve the matrix structure.
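Broken into its intermediate steps, the one-liner looks roughly like this. The names hits, counts, keep and result are mine, introduced only to show each stage.

```r
# Same matrix as in the question
m <- matrix(c("A", "B", "B", "B", "C", "A", "C", "C", "D"), nrow = 3)

# Step 1: element-wise comparison gives a logical matrix of the same shape
hits <- m == "D"

# Step 2: TRUE counts as 1, so colSums gives the number of "D" entries per column
counts <- colSums(hits)

# Step 3: a column qualifies if it has at least one match
keep <- counts > 0

# Step 4: index the columns with the logical vector, keeping the matrix shape
result <- m[, keep, drop = FALSE]
```

Printing hits, counts and keep in turn is a quick way to check each stage when adapting this to another value or a larger matrix.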
