
An efficient way to grow a data frame using a by function

I need to do some analysis on vehicles that are identified by their ID. The results of this analysis will include some numeric, factor, and logical information. All the data used in the analysis is in one data frame, so the call looks like this:

Results <- by(Data, Data$ID, Function)

Where Function is designed to give output like this:

Function <- function(DF) {
    ## Do stuff...
    return(c(23.2, as.factor("SuperFast"), TRUE))
}

What's been great about this approach so far is that in addition to being quite fast (taking ~1 min where a for loop took hours), it's easy to put in data.frame format by:

as.data.frame(do.call("rbind", Results))

But of course, c in Function and "rbind" in do.call coerce everything into the same object type. To work around this, I've been making Function return a character vector (like as.character(c(23.2, "SuperFast", TRUE))) and then converting the column types manually at the end.
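A minimal sketch of the coercion described above, using the placeholder values from Function (the variable names x and y are made up for illustration):

```r
# c() returns an atomic vector, so mixed inputs are coerced to one common type.
x <- c(23.2, as.factor("SuperFast"), TRUE)
class(x)   # a single type for all three elements; the factor label is lost

# With a character element present, everything becomes character.
y <- c(23.2, "SuperFast", TRUE)
class(y)
```

In both cases the original types are gone before rbind ever sees the values.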

Is there (1) a way to return something that can be a row in a dataframe that has different object types or (2) a better method than using by and c (for rows)?

Just for kicks, here's something that can be used for Data:

Data <- data.frame(ID=c(1,2,2,3))

Just return a data frame instead of a vector from your function:

Function <- function(DF) {
    ## Do stuff...
    return(data.frame(a = 23.2, b = as.factor("SuperFast"), c = TRUE))
}

As an aside, the only thing coercing everything to the same data type was c. rbind has a data frame method that will (mostly) preserve types, assuming all the data frames you pass to it line up.
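Here is a rough end-to-end sketch of that approach. The speed column and the names mean_speed, label, flag, and Combined are assumptions added for illustration; only ID comes from the question.

```r
# Toy data: the ID column from the question plus a made-up speed column.
Data <- data.frame(ID = c(1, 2, 2, 3), speed = c(10, 20, 30, 40))

# Hypothetical per-vehicle summary that returns a one-row data frame,
# so each column keeps its own type.
Function <- function(DF) {
    data.frame(
        mean_speed = mean(DF$speed),    # numeric
        label = as.factor("SuperFast"), # factor
        flag = TRUE                     # logical
    )
}

Results <- by(Data, Data$ID, Function)

# rbind dispatches to its data frame method here, so types are preserved.
Combined <- do.call("rbind", Results)
str(Combined)
```

Combined should have one row per ID, with no manual type conversion needed afterwards.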
