How can I get do.call (namespace:base) and rbindlist (namespace:data.table) to behave the same? rbindlist eliminates factor levels while do.call does not. The following shows the issue:
library(data.table)
(dataList <- list(data.frame(f1=rep(c("a"), each=1),"c"=rnorm(2),"d"=rnorm(2)),
                  data.frame(f1=rep(c("b"), each=1),"c"=rnorm(2),"d"=rnorm(2))) )
(rbindlist.Data <- rbindlist(dataList)) # combines the list into ONE data.table
(do.call.Data <- do.call(rbind, dataList)) # combines the list into ONE data.frame
It's true that rbindlist doesn't deal well with factors. Notice that the internal representation (the integer code) of "a" in dataList[[1]]$f1 and the internal representation of "b" in dataList[[2]]$f1 are both 1; verify this using str(dataList). Unfortunately, rbindlist will combine the internal representations as they are, without reconciling the levels; verify this using str(rbindlist.Data).
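As a minimal sketch of that check, the base R snippet below pulls out the integer codes and levels directly instead of reading them off str(). It assumes the f1 columns are factors; since R 4.0.0 data.frame no longer converts strings to factors by default, so stringsAsFactors = TRUE is set explicitly here to reproduce the situation described above.

```r
# Two data frames whose f1 columns are factors with different levels.
# stringsAsFactors = TRUE is explicit because R >= 4.0.0 defaults to FALSE.
dataList <- list(
  data.frame(f1 = rep("a", 2), c = rnorm(2), d = rnorm(2), stringsAsFactors = TRUE),
  data.frame(f1 = rep("b", 2), c = rnorm(2), d = rnorm(2), stringsAsFactors = TRUE)
)

# Integer codes stored inside each factor.
codes1 <- as.integer(dataList[[1]]$f1)
codes2 <- as.integer(dataList[[2]]$f1)

# Labels that those codes refer to.
levels1 <- levels(dataList[[1]]$f1)
levels2 <- levels(dataList[[2]]$f1)

print(codes1)   # codes for "a"
print(codes2)   # codes for "b"
print(levels1)
print(levels2)
```

Both columns store the same code, but the code points at a different label in each frame, which is why combining the codes without merging the levels loses information.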
The solution is to rbindlist character columns, and not factor columns, unless you're sure the factor columns use exactly the same factor representation (with the same levels and labels). One way to do this is to use data.table consistently:
(dataList <- list(data.table(f1=rep(c("a"), each=1),"c"=rnorm(2),"d"=rnorm(2)),
data.table(f1=rep(c("b"), each=1),"c"=rnorm(2),"d"=rnorm(2))) )
(rbindlist.Data <- rbindlist(dataList))
produces the desired result, because data.table won't convert strings to factors.
You could use your original code with stringsAsFactors = FALSE (either in the data.frame call or using options). I wouldn't recommend this, though, as there's no harm (and much benefit) in using data.table from the beginning.
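For completeness, here is a rough sketch of the stringsAsFactors = FALSE variant. The rbindlist call is guarded with requireNamespace so the snippet still runs where data.table is not installed; that guard is an addition for illustration, not part of the original code.

```r
# Keep f1 as character by switching off string-to-factor conversion.
dataList <- list(
  data.frame(f1 = "a", c = rnorm(2), d = rnorm(2), stringsAsFactors = FALSE),
  data.frame(f1 = "b", c = rnorm(2), d = rnorm(2), stringsAsFactors = FALSE)
)

# Every f1 column should now be a plain character vector.
f1_is_character <- vapply(dataList, function(df) is.character(df$f1), logical(1))
print(f1_is_character)

# Bind with rbindlist when data.table is available.
if (requireNamespace("data.table", quietly = TRUE)) {
  combined <- data.table::rbindlist(dataList)
  print(combined$f1)
}
```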
If you aren't making the data.frame yourself, you'll have to convert the column types. It's not hard with a data.table call; see Convert column classes in data.table.
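If the data frames arrive with factor columns already in place, one possible approach is to convert them before binding. The sketch below uses base R rather than the data.table call from the linked post, and factors_to_character is a hypothetical helper name introduced only for this illustration.

```r
# Hypothetical helper: turn every factor column of a data frame into character.
factors_to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  if (any(is_fac)) {
    df[is_fac] <- lapply(df[is_fac], as.character)
  }
  df
}

# Input as it might arrive from elsewhere, with f1 stored as a factor.
dataList <- list(
  data.frame(f1 = rep("a", 2), c = rnorm(2), d = rnorm(2), stringsAsFactors = TRUE),
  data.frame(f1 = rep("b", 2), c = rnorm(2), d = rnorm(2), stringsAsFactors = TRUE)
)

# Convert first, then bind; the same converted list could be passed to rbindlist.
cleaned  <- lapply(dataList, factors_to_character)
combined <- do.call(rbind, cleaned)

print(class(combined$f1))
print(combined$f1)
```

Converting before binding means the result no longer depends on how each input happened to encode its factor levels.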