I have a data table with two columns whose names are built from variables. I'm a touch new to the quirks of the data.table package, but I've gotten something like the following code to work so far...
varNames <- c("Subtype", ...)
for (i in seq_along(varNames)) {  # seq_along() visits every name; length() alone yields only the last index
  nm1 <- paste0(varNames[i], "1")
  nm2 <- paste0(varNames[i], "2")
  DT[, (nm1) := x1]
  DT[, (nm2) := x2]
  #A BUNCH OF OTHER CODE GOES HERE...
}
I want to single out the rows where columns named nm1 and columns named nm2 match, but I know I can't just do this...
nmMatch <- (paste0(varNames[i],"Match"))
DT[, (nmMatch) := F ]
DT[(nm1)==(nm2), (nmMatch) := T] #Returns empty data table :^(
I think this is either because there are no columns actually named "nm1" or "nm2" or because the variable named nm1 does not equal the variable named nm2.
If I didn't need to assign these based on a vector of character values, I would write this to get what I'm looking for...
DT[, "SubtypeMatch" := F]
DT[(Subtype1) == (Subtype2), SubtypeMatch := T]
How do I get a subset of rows based on column values if I need to reference those column names through variables? Is there a way to do that for data tables? These end up being huge structures (> 1000000 rows), so any workarounds using sapply() end up being prohibitively slow.
I recognize that there may be ways that I could fundamentally restructure my code so that I never really need to do this, and I'm happy to hear those, but I'm also interested in any "Proper" way to accomplish this subsetting task with data.tables.
Use get:
library(data.table)
DT[, (nmMatch) := FALSE ]
DT[get(nm1)== get(nm2), (nmMatch) := TRUE]
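Why the original attempt selects nothing: inside `i`, `(nm1) == (nm2)` compares the two character strings themselves ("Subtype1" vs "Subtype2"), which is a single FALSE, so no rows are chosen. `get()` instead looks up the column whose name is stored in the variable. Here is a minimal, self-contained sketch of the whole loop; the toy values in `x1` and `x2` are made up for illustration and are not from the question.

```r
library(data.table)

# Toy data: x1 and x2 stand in for the real source columns (assumed values).
DT <- data.table(x1 = c("a", "b", "c"), x2 = c("a", "z", "c"))

varNames <- c("Subtype")
for (i in seq_along(varNames)) {
  nm1     <- paste0(varNames[i], "1")
  nm2     <- paste0(varNames[i], "2")
  nmMatch <- paste0(varNames[i], "Match")

  # (name) := value creates a column named by the string held in the variable
  DT[, (nm1) := x1]
  DT[, (nm2) := x2]

  # get() resolves the string to the actual column, so the comparison is
  # done row by row on the column values rather than on the names
  DT[, (nmMatch) := FALSE]
  DT[get(nm1) == get(nm2), (nmMatch) := TRUE]
}

print(DT$SubtypeMatch)  # TRUE FALSE TRUE
```

If the columns contain no missing values, the two assignment steps could probably be collapsed into `DT[, (nmMatch) := get(nm1) == get(nm2)]`. With NA values that one-step form would store NA rather than FALSE for the affected rows, so the two-step version above is the safer match for the original intent.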