
Basic R: Subsetting DF by Columns with Logical Vector

I have a data frame, trainSmall, with six columns.

> trainSmall
     chr      pos      end LCR gc.50  type
  1:  22 39491638 39491639   0     0 del_L
  2:  22 29434028 29434029   0     0   ins
  3:  22 28347247 28347248   0     0 del_R
  4:  22 40121931 40121932   0     0   ins
  5:  22 39122351 39122352   0     0 del_L
 ---                                      
768:  22 27869380 27869381   0     0 del_R
769:  22 28823159 28823160   0     0   ins
770:  22 24319557 24319558   0     0 del_R
771:  22 38570330 38570331   0     0 del_L
772:  22 48182139 48182140   0     0 del_L
> is.data.frame(trainSmall)
[1] TRUE

I also have a vector, excl, with four items.

> excl
[1] "chr"  "pos"  "end"  "type"

I would like to take all rows of trainSmall, but only the columns not in excl. So I tried

> trainSmall[, !colnames(trainSmall) %in% excl]
[1] FALSE FALSE FALSE  TRUE  TRUE FALSE

But this just gives me back the logical vector itself, not the selected columns of the data frame.

Even doing

> trainSmall[, c(F,F,F,T,T,F)]
[1] FALSE FALSE FALSE  TRUE  TRUE FALSE

doesn't work as I expected.

I'm pretty confused because this seems to be the method advocated in many places (like this answer) for subsetting a data frame. What am I doing wrong?

Response to possible duplicate flag: None of the solutions there seem to work in this case.

> trainSmall[, -which(names(trainSmall) %in% excl)]
[1] -1 -2 -3 -6
> trainSmall[ , !names(trainSmall) %in% excl]
[1] FALSE FALSE FALSE  TRUE  TRUE FALSE

On a plain data.frame you could go for the following (the parentheses are only for readability, since %in% binds more tightly than !):

df[, !(colnames(df) %in% excl)]
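
As a minimal sketch of the logical-mask approach on a base data.frame: the toy df below is an assumption that reuses the first three rows printed in the question, and keep, res and res2 are illustrative names only.

```r
# Toy stand-in for trainSmall as a plain data.frame
# (first three rows from the question, used here only for illustration).
df <- data.frame(
  chr   = c(22, 22, 22),
  pos   = c(39491638, 29434028, 28347247),
  end   = c(39491639, 29434029, 28347248),
  LCR   = c(0, 0, 0),
  gc.50 = c(0, 0, 0),
  type  = c("del_L", "ins", "del_R")
)
excl <- c("chr", "pos", "end", "type")

# Logical mask: TRUE for every column whose name is not in excl.
keep <- !(colnames(df) %in% excl)

# drop = FALSE keeps the result a data.frame even if only one column survives.
res <- df[, keep, drop = FALSE]

# Equivalent selection by name instead of by mask.
res2 <- df[, setdiff(colnames(df), excl), drop = FALSE]
```

Here both res and res2 contain only the LCR and gc.50 columns. This relies on data.frame indexing, where the second index is treated as a column selector rather than as an expression to evaluate.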

Another fun way would be to make an operator yourself (doing the opposite of %in%):

excl <- c("chr", "pos", "end", "type")

# Negated %in%: TRUE where an element of x is NOT found in y
`%!in%` <- function(x, y) !(x %in% y)
mask <- colnames(df) %!in% excl
df[, mask]

Both will yield

   LCR gc.50
1:   0     0
2:   0     0
3:   0     0
4:   0     0
5:   0     0

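Given the output of your code, I think your data are in data.table format (a data.table has both data.frame and data.table as its class, which is why is.data.frame() returns TRUE). In a data.table, the second argument is evaluated as an expression by default, so a logical vector there is simply returned as-is. So, this should work: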

trainSmall[, !excl, with = FALSE]
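
A small sketch of the data.table case, assuming the data.table package is installed. The toy dt reuses the first three rows from the question, and keep, res_with and res_dots are illustrative names only.

```r
library(data.table)

# Toy stand-in for trainSmall as a data.table (rows taken from the question).
dt <- data.table(
  chr   = c(22, 22, 22),
  pos   = c(39491638, 29434028, 28347247),
  end   = c(39491639, 29434029, 28347248),
  LCR   = c(0, 0, 0),
  gc.50 = c(0, 0, 0),
  type  = c("del_L", "ins", "del_R")
)
excl <- c("chr", "pos", "end", "type")

# A data.table is also a data.frame, so is.data.frame() is TRUE.
cls <- class(dt)

# Here j is evaluated as an ordinary expression, so the result should be
# the logical vector itself (the behaviour reported in the question).
mask <- dt[, !(names(dt) %in% excl)]

# with = FALSE tells data.table to treat j as a column selector.
res_with <- dt[, !excl, with = FALSE]

# Alternative: build the names to keep and use the ".." prefix,
# which looks the variable up in the calling scope
# (available in reasonably recent data.table versions).
keep <- setdiff(names(dt), excl)
res_dots <- dt[, ..keep]
```

Both res_with and res_dots should contain only LCR and gc.50. Converting with as.data.frame(dt) first and then using base-style indexing is another option if data.table semantics are not needed.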
