Subset dataframe with list of columns in R

I want to select the columns of my data frame whose names I have stored in a string variable. For example:

v1 <- rnorm(100)
v2 <- rnorm(100)
v3 <- rnorm(100)
df <- data.frame(v1,v2,v3)

I want to accomplish the following:

df[,c('v1','v2')]

But I want to use a variable instead of c('v1','v2'). These all fail:

select.me <- "'v1','v2'"
df[,select.me]
df[,c(select.me)]
df[,c(paste(select.me,sep=''))]

Thanks for help with a simple question.

The great irony here is that when you said "I want to do this" the first expression should have succeeded,

df[,c('v1','v2')]
> str( df[,c('v1','v2')] )
'data.frame':   100 obs. of  2 variables:
 $ v1: num  -0.3347 0.2113 0.9775 -0.0151 -1.8544 ...
 $ v2: num  -1.396 -0.95 -1.254 0.822 0.141 ...

whereas all the later attempts would fail. I later realized that you didn't know that you could use select.me <- c('v1','v2') ; df[ , select.me]. You could also use these forms, which might be safer in some instances:

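df[ , names(df) %in% select.me]  # logical indexing
# grep()/grepl() use a single pattern, so collapse the names into one anchored regex
pat <- paste0("^(", paste(select.me, collapse = "|"), ")$")
df[ , grep(pat, names(df)) ]   # numeric indexing
df[ , grepl(pat, names(df)) ]  # logical indexing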
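As a rough sketch of how the character, logical and numeric forms above relate, the snippet below rebuilds a data frame like the one in the question and compares them. The seed, the object names (by_name, rest_logical and so on) and the use of which() for the numeric form are my own choices for illustration, not part of the original answer.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100))
select.me <- c("v1", "v2")

by_name    <- df[, select.me]                        # character indexing
by_logical <- df[, names(df) %in% select.me]         # logical indexing
by_number  <- df[, which(names(df) %in% select.me)]  # numeric indexing

# The three forms should pick out the same two columns
identical(by_name, by_logical)
identical(by_name, by_number)

# Complement: negate the logical index, or put a minus on the numeric one
rest_logical <- df[, !(names(df) %in% select.me), drop = FALSE]
rest_number  <- df[, -which(names(df) %in% select.me), drop = FALSE]
names(rest_logical)

# With a single name, `[` returns a plain vector unless drop = FALSE is given
one_vec <- df[, "v1"]
one_df  <- df[, "v1", drop = FALSE]
```

One caution with the minus form: if nothing matches, the numeric index is empty and negating it selects no columns at all, so the logical form tends to be the more predictable way to take a complement.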

Any of those can be used with negation (!logical) or minus (-numeric) to retrieve the complement, whereas you cannot use character indexing with negation. If you wanted to go down one level in understandability and were willing to change the select.me value to a valid R expression, you could do this:

select.me <- "c('v1','v2')"
df[ , eval(parse(text=select.me)) ]

Not that I recommend this; it is just to let you know that such a thing is possible after you "learn to walk". It would also have been possible, although rather baroque, to pull the names out of your original quoted string (though I think this just illustrates why the c('v1','v2') version is superior):

select.me <- "'v1','v2'"
df [ , scan(textConnection(select.me), what="", sep=",") ]
> str( df [ , scan(textConnection(select.me), what="", sep=",") ] )
Read 2 items
'data.frame':   100 obs. of  2 variables:
 $ v1: num  -0.3347 0.2113 0.9775 -0.0151 -1.8544 ...
 $ v2: num  -1.396 -0.95 -1.254 0.822 0.141 ...
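If the quoted string really is all you have, here is a minimal sketch of turning it into a proper character vector first and indexing afterwards. It assumes the names contain no commas or embedded quotes; the small stand-in data frame and the names cols_scan and cols_split are made up for the example.

```r
select.me <- "'v1','v2'"

# scan() strips the quote characters by default; quiet = TRUE hides "Read 2 items"
cols_scan <- scan(text = select.me, what = "", sep = ",", quiet = TRUE)

# Alternative without scan(): drop the single quotes, then split on commas
no_quotes  <- gsub("'", "", select.me, fixed = TRUE)
cols_split <- strsplit(no_quotes, ",", fixed = TRUE)[[1]]

df <- data.frame(v1 = 1:3, v2 = 4:6, v3 = 7:9)  # small stand-in data frame
df[, cols_scan]
```

Either way the parsing happens once, up front, and the subsetting itself is the ordinary df[ , character_vector] form.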

This is basic R syntax; perhaps you need to read the introductory manual:

select.me <- c('v1','v2')
df[,select.me]
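To make the difference concrete, a short sketch (with a small stand-in data frame and the made-up names bad and good) of why the single string from the question fails while the character vector works:

```r
df <- data.frame(v1 = 1:3, v2 = 4:6, v3 = 7:9)  # small stand-in data frame

bad  <- "'v1','v2'"    # ONE string that merely looks like two names
good <- c("v1", "v2")  # a character vector holding two names

length(bad)   # number of elements in the single string
length(good)  # number of elements in the vector

# Indexing with `bad` looks for a single column literally named 'v1','v2',
# which does not exist, so the error message is captured instead of stopping
result <- tryCatch(df[, bad], error = function(e) conditionMessage(e))
result

names(df[, good])
```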

Is this what you mean?

dat <- cbind(df$v1, df$v2)
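A caveat worth checking for yourself: as far as I can tell, cbind() on two numeric columns gives back a matrix rather than a data frame, and the column names are lost unless you supply them. A small sketch with a stand-in data frame (dat_named and sub are names I made up):

```r
df <- data.frame(v1 = 1:3, v2 = 4:6, v3 = 7:9)  # small stand-in data frame

dat <- cbind(df$v1, df$v2)
is.matrix(dat)   # cbind() of plain vectors builds a matrix
colnames(dat)    # no names are carried over from df$v1 / df$v2

# Naming the arguments keeps the column names, but the result is still a matrix
dat_named <- cbind(v1 = df$v1, v2 = df$v2)
colnames(dat_named)

# Subsetting the data frame keeps both the class and the names
sub <- df[, c("v1", "v2")]
class(sub)
```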
