
R: accessing variable column names for subsetting

The following works and does what I want it to do:

    dat<-subset(data,NLI.1 %in% NLI)

However, I may need to subset on a different column (i.e., NLI.2 or NLI.3). I've tried

    NLI_col<-"NLI.1"
    NLI_col<-subset(data,select=NLI_col) 
    dat<-subset(data,NLI_col %in% NLI)

Unsurprisingly this doesn't work. How do I use NLI_col to achieve the result from the code that does work?

It was requested that I give an example of what data looks like. Here:

    NLI.1<-c(NA,NA,NA,NA,NA,1,2,2,2,NA,2,2,2,2,2,2,2,NA,NA,2,2,2,2,NA,2,2,2,2,2,2,2,NA,NA,NA,NA,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,NA,2,2,2,2,2,2,2,2,2,2,NA,2,2,2,2,2,2,2,2,2,2,2,1,2,2,2,2,2,1,2,2,2,2,2,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,2,2,1,2,2,2)
    NLI.2<-c(NA,NA,NA,NA,NA,NA,2,2,2,NA,NA,2,2,2,2,2,2,NA,2,2,2,2,2,NA,2,2,2,2,2,2,2,NA,NA,NA,NA,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,NA,2,2,2,2,2,2,2,2,2,2,2,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,2,2,2,2,2,2,1,2,2,2,2,2,2,2)
    NLI.3<-c(NA,35,40,NA,10,NA,31,NA,14,NA,NA,15,17,NA,NA,16,10,15,14,39,17,35,14,14,22,10,15,0,34,23,13,35,32,2,14,10,14,10,10,10,40,10,13,13,10,10,10,13,13,25,10,35,NA,13,NA,10,40,0,0,20,40,10,14,40,10,10,10,10,13,10,8,NA,NA,14,NA,10,28,10,10,15,15,16,10,10,35,16,NA,NA,NA,NA,30,19,14,30,10,10,8,10,21,10,10,35,15,34,10,39,NA,10,10,6,16,10,10,10,10,34,10)
    other<-c(NA,NA,511,NA,NA,NA,NA,NA,849,NA,NA,NA,NA,1324,1181,832,1005,166,204,1253,529,317,294,NA,514,801,534,1319,272,315,572,96,666,236,842,980,290,843,904,528,27,366,540,560,659,107,63,20,1184,1052,214,46,139,310,872,891,651,687,434,1115,1289,455,764,938,1188,105,757,719,1236,982,710,NA,NA,632,NA,546,747,941,1257,99,133,61,249,NA,NA,1080,NA,645,19,107,486,1198,276,777,738,1073,539,1096,686,505,104,5,55,553,1023,1333,NA,NA,969,691,1227,1059,358,991,1019,NA,1216)

    data<-cbind(NLI.1,NLI.2,NLI.3,other)
    NLI<-c(10,13)

With this, after subsetting I should get all the rows with tens and thirteens in data$NLI.3 if NLI_col <- "NLI.3".

Since this is relatively trivial I am guessing this is a duplicate question (my apologies), but the hours drag on and I still can't find a solution.

It seems you don't need subset here. Index with the column name instead:

    NLI_col <- 'NLI.3'
    head(data[,NLI_col] %in% NLI)
    ##  [1] FALSE FALSE FALSE FALSE  TRUE FALSE
    head(data[data[,NLI_col] %in% NLI, ])
    ##     NLI.1 NLI.2 NLI.3 other
    ##  5     NA    NA    10    NA
    ##  17     2     2    10  1005
    ##  26     2     2    10   801
    ##  31     2     2    13   572
    ##  36     2     2    10   980
    ##  38     2     2    10   843
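To make the idea above reusable, here is a minimal sketch that wraps the indexing in a function. The helper name subset_by_col and the small toy data frame are made up for illustration and are not part of the question. It assumes the data is a data frame; for a matrix built with cbind, use d[, col] instead of d[[col]].

```r
# Toy data frame standing in for the question's data (hypothetical values)
toy <- data.frame(NLI.1 = c(NA, 1, 2, 2),
                  NLI.3 = c(10, 35, 13, NA),
                  other = c(5, 6, 7, 8))
NLI <- c(10, 13)

# Hypothetical helper: keep the rows of data frame `d` whose value in the
# column named by the string `col` is one of `values`
subset_by_col <- function(d, col, values) {
  keep <- d[[col]] %in% values   # %in% gives FALSE (not NA) for missing values
  d[keep, , drop = FALSE]
}

res <- subset_by_col(toy, "NLI.3", NLI)
print(res)
```

Switching to another column is then just a matter of passing "NLI.1" or "NLI.2" as the second argument.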

I'm not sure I'm following the question exactly. Are you asking to keep just the rows where NLI.3 is 10 or 13? Is it more complicated than that?

If you just want those rows:

    df[ which(df$NLI.3==10 | df$NLI.3==13 ),]

This assumes your data is in a data frame. Also, I changed the name of the data frame from 'data' to 'df', since data is also the name of a built-in R function and reusing it can be confusing.
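The same approach can probably be adapted to a column name held in a variable by swapping df$NLI.3 for df[[NLI_col]]. The sketch below uses a small made-up data frame (not the question's data) and also shows why the which() wrapper matters when the column contains NA.

```r
# Made-up data frame for illustration only
df <- data.frame(NLI.3 = c(NA, 10, 35, 13), other = c(1, 2, 3, 4))
NLI_col <- "NLI.3"

# == returns NA for missing values, so a bare logical index
# produces an extra all-NA row
with_na <- df[df[[NLI_col]] == 10 | df[[NLI_col]] == 13, ]

# which() keeps only the positions that are TRUE, dropping the NA
no_na <- df[which(df[[NLI_col]] == 10 | df[[NLI_col]] == 13), ]

# %in% never returns NA, so it needs no which()
same <- df[df[[NLI_col]] %in% c(10, 13), ]

print(nrow(with_na))
print(no_na)
```

With more than two values to match, the %in% form scales better than chaining == comparisons with |.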
