I would like to subset an ffdf object by index, returning another ffdf object.
The help file on subset.ffdf indicates that you can pass a range index (ri) object as an argument, but when I tried:
data_subset <- subset.ffdf(data, ri(1, 1e5))
I got this error:
Error in which(eval(e, nl, envir)) : argument to 'which' is not logical
Per You-Leee's suggestion, I tried passing a logical vector of the index of interest with this code:
n <- length(data[[1]]) #10.5 million
logical_index = c(1, 1e5) == seq.int(1, n)
data_subset <- subset(data, logical_index)
I ran it twice, and each time RStudio crashed with the message:
R encountered a fatal error. The session was terminated.
At first I thought it might be a memory constraint, but according to Activity Monitor I still have 4 GB available out of 8 GB. Besides, this shouldn't be loading much into memory anyway.
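The argument has to be a logical vector with one element per row: TRUE at the rows you want to keep and FALSE everywhere else. In the 12-row example below only row 1 matches, so the result has a single row: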
> data <- ffdf(a = ff(1:12))
> subset.ffdf(data, c(1, 1e5) == seq.int(1, length(data$a)))
ffdf (all open) dim=c(1,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
  PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix PhysicalIsMatrix
a            a      integer       integer FALSE           FALSE            FALSE
  PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
a                 1                1               1           TRUE
ffdf data
  a
1 1
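The result above has one row because `c(1, 1e5) == seq.int(1, n)` recycles the two-element vector along `1..n`; it does not describe the range from 1 to 1e5. The sketch below uses base R only to show what that comparison selects and one way to build a logical index for a contiguous range. The names `n`, `lo`, `hi`, `idx` and `range_index` are illustrative and not part of ff.

```r
# Hypothetical row count, matching the 12-row example above.
n <- 12L

# c(1, 1e5) is recycled along 1..n: odd positions are compared with 1,
# even positions with 1e5.
recycled <- c(1, 1e5) == seq.int(1, n)
which(recycled)     # only position 1 matches while n < 1e5

# A logical index that is TRUE for every row in lo..hi and FALSE elsewhere.
lo <- 3L
hi <- 7L
idx <- seq_len(n)
range_index <- idx >= lo & idx <= hi
which(range_index)  # positions 3 through 7
```

For the range in the question, the same pattern would be `seq_len(n) <= 1e5`. That vector is built in RAM with one element per row, and whether `subset` then copes with it on a 10.5-million-row ffdf is not verified here.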