Convert data frame to list

I am trying to go from a data frame to a list structure in R (I know that, technically, a data frame is a list). I have a data frame containing reference chemicals and their mechanisms at different targets. For example, estrogen is an estrogen receptor agonist. I would like to transform the data frame into a nested list, because I am tired of typing out something like:

refchem$chemical_id[refchem$target=="AR" & refchem$mechanism=="Agonist"]

every time I need to access the list of specific reference chemicals. I would much rather access the chemicals by:

refchem$AR$Agonist

I am looking for a general answer, even though I have given a simplified example, because not all targets have all mechanisms.

This is really easy to accomplish with a loop:

example <- data.frame(target=rep(c("t1","t2","t3"),each=20),
                      mechan=rep(c("m1","m2"),each=10,times=3),
                      chems=paste0("chem",1:60))
oneoption <- list()
for(target in unique(example$target)){
  oneoption[[target]] <- list()
  for(mech in unique(example$mechan)){
    oneoption[[target]][[mech]] <- as.character(example$chems[ example$target==target & example$mechan==mech ])
  }
}

I am just wondering if there is a cleverer way to do it. I tried playing around with lapply but did not make any progress.

Using split:

split(refchem, list(refchem$target, refchem$mechanism))

Should do the trick.

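If you assign the result back to refchem, each subset is then accessible as refchem$AR.Agonist. Note that each element is a data frame subset, not a bare vector of chemical ids.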
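The split call above returns a flat list of data frames. If the nested refchem$AR$Agonist style from the question is wanted, one possible sketch is to split twice: first by target, then split the chemical ids by mechanism within each target. This uses the question's toy example data frame; the name nested is only illustrative, and drop = TRUE is there so that mechanisms absent for a given target do not show up as empty entries.

```r
# Toy data mirroring the question's `example` data frame.
example <- data.frame(target = rep(c("t1", "t2", "t3"), each = 20),
                      mechan = rep(c("m1", "m2"), each = 10, times = 3),
                      chems  = paste0("chem", 1:60),
                      stringsAsFactors = FALSE)

# Outer split: one data frame per target.
# Inner split: within each target, one vector of chemical ids per mechanism.
nested <- lapply(split(example, example$target),
                 function(d) split(d$chems, d$mechan, drop = TRUE))

# Chemical ids for target t1 with mechanism m2.
nested$t1$m2
```

With the real data this would be split(refchem, refchem$target) on the outside and split(d$chemical_id, d$mechanism, drop = TRUE) on the inside, assuming the column names used in the question.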

If you make a keyed data.table instead, ...

  • you'll still have all the data in one data.frame (instead of a possibly-nested list of many);
  • you may find iterating over these subsets nicer; and
  • the syntax is pretty clean:

To access a subset:

DT[.('AR','Agonist')] 

To do something for each group, with the per-group results rbind-ed together in the output:

DT[, .(n = .N), by = key(DT)]   # replace .(n = .N) with any per-group expression

As with aggregate(), any list of vectors of the correct length can go into by, not just the key.

Finally, DT came from...

library(data.table)
# Key on both columns so that DT[.('AR','Agonist')] can look rows up by key
DT <- data.table(refchem, key = c('target', 'mechanism'))
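For a version that can be run end to end, here is a minimal sketch of the keyed data.table approach using the question's toy example data instead of refchem. The column names target, mechan and chems come from that toy data; the names ids and per_group are only illustrative, and the data.table package is assumed to be installed.

```r
library(data.table)

# Toy data mirroring the question's `example` data frame.
example <- data.frame(target = rep(c("t1", "t2", "t3"), each = 20),
                      mechan = rep(c("m1", "m2"), each = 10, times = 3),
                      chems  = paste0("chem", 1:60),
                      stringsAsFactors = FALSE)

# Build a data.table keyed on the two grouping columns.
DT <- data.table(example, key = c("target", "mechan"))

# Keyed lookup: chemical ids for target t1 with mechanism m2.
ids <- DT[.("t1", "m2"), chems]

# Per-group computation; one row per (target, mechan) pair in the result.
per_group <- DT[, .(n_chems = .N, first_chem = chems[1]), by = key(DT)]
```

The lookup returns a plain character vector because chems is named in the j position; leaving j out returns the matching rows as a data.table instead.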

You can also use a plyr function:

library(plyr)
dlply(example, .(target, mechan))

It has the added advantage that you can pass a function to process each piece, if needed (the call above implicitly uses identity, so each element is the unmodified data frame subset).
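As a sketch of the function argument mentioned above, the call below passes an anonymous function so that each list element is just the vector of chemical ids rather than a data frame. It uses the question's toy example data; the name chem_list is only illustrative, and the plyr package is assumed to be installed.

```r
library(plyr)

# Toy data mirroring the question's `example` data frame.
example <- data.frame(target = rep(c("t1", "t2", "t3"), each = 20),
                      mechan = rep(c("m1", "m2"), each = 10, times = 3),
                      chems  = paste0("chem", 1:60),
                      stringsAsFactors = FALSE)

# Split by target and mechanism, keeping only the chemical ids of each piece.
chem_list <- dlply(example, .(target, mechan),
                   function(d) as.character(d$chems))

# Elements are named "<target>.<mechan>", so they are accessed by that label.
chem_list[["t1.m2"]]
```

As with split, the result is a flat list keyed by the combined label rather than a nested one.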
