
Select data.frame columns whose names match a pattern

I have two very similar questions:

1) I have a data frame, test <- data.frame(NAT1=1,NAT2=2,NAT3=3,NAT_S1=4,NAT_S2=5):

  NAT1 NAT2 NAT3 NAT_S1 NAT_S2
1    1    2    3      4      5

I want to select the columns whose names start with NAT and end with a number:

  NAT1 NAT2 NAT3 
1    1    2    3   

2) I want to do the same with a character vector: from colnames(test), I want to keep only: [1] "NAT1" "NAT2" "NAT3"

I suppose there is a function that can do this. Thanks for your help.

test[, grepl("^NAT\\d$", names(test))]
  NAT1 NAT2 NAT3
1    1    2    3
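The same pattern also covers the second question, filtering the character vector of names itself. A minimal sketch using base R's grep with value = TRUE; the variable names kept and subset_df are only illustrative, and drop = FALSE is an addition that keeps the result a data frame when only one column matches:

```r
test <- data.frame(NAT1 = 1, NAT2 = 2, NAT3 = 3, NAT_S1 = 4, NAT_S2 = 5)

# value = TRUE returns the matching names instead of their positions
kept <- grep("^NAT\\d$", colnames(test), value = TRUE)
print(kept)

# Index by name; drop = FALSE keeps a data frame even if
# only a single column matches
subset_df <- test[, kept, drop = FALSE]
print(subset_df)
```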

If there can be more than one digit after NAT, you can use the quantifier +. To match a fixed number of digits or a range, use a { } quantifier:

test[, grepl("^NAT\\d+$", names(test))] # matches one or more digits
test[, grepl("^NAT\\d{2}$", names(test))] # matches exactly 2 digits
test[, grepl("^NAT\\d{2,4}$", names(test))] # matches 2 to 4 digits
test[, grepl("^NAT\\d{2,}$", names(test))] # matches 2 or more digits
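Since test only has single-digit names, the quantifiers are easier to compare on a vector with longer suffixes. A small sketch, assuming a hypothetical set of names (nms is not from the question) chosen only to exercise each pattern:

```r
# Hypothetical column names, chosen only to exercise each quantifier
nms <- c("NAT1", "NAT12", "NAT123", "NAT12345", "NAT_S1")

one_or_more <- nms[grepl("^NAT\\d+$", nms)]     # one or more digits
exactly_two <- nms[grepl("^NAT\\d{2}$", nms)]   # exactly 2 digits
two_to_four <- nms[grepl("^NAT\\d{2,4}$", nms)] # 2 to 4 digits
two_or_more <- nms[grepl("^NAT\\d{2,}$", nms)]  # 2 or more digits

print(one_or_more)
print(exactly_two)
print(two_to_four)
print(two_or_more)
```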

Alternatively, you can use str_detect from the stringr package:

library(stringr)
test[, str_detect(colnames(test), "^NAT\\d$")]
  NAT1 NAT2 NAT3
1    1    2    3

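With dplyr, we can use matches() inside select(). The digits must follow NAT directly; a pattern with .* in between would also match NAT_S1 and NAT_S2. Note that matches() ignores case by default.

library(dplyr)
test %>%
   select(matches("^NAT\\d+$"))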
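A pattern that allows arbitrary characters between NAT and the trailing digits, such as "^NAT.*\\d+$", also matches NAT_S1 and NAT_S2, which the question wants to exclude. This can be checked without dplyr; the following is a quick base R sketch, with loose and strict as illustrative names:

```r
nms <- c("NAT1", "NAT2", "NAT3", "NAT_S1", "NAT_S2")

# ".*" can absorb "_S", so every name passes
loose <- nms[grepl("^NAT.*\\d+$", nms)]

# Digits must come directly after "NAT"
strict <- nms[grepl("^NAT\\d+$", nms)]

print(loose)
print(strict)
```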
