How to subset ffdf by index?

I would like to subset an ffdf object by index, returning another ffdf object.

The help file on subset.ffdf indicates that you can pass a range index (ri) object as an argument, but when I tried:

data_subset <- subset.ffdf(data, ri(1, 1e5))

I got this error:

Error in which(eval(e, nl, envir)) : argument to 'which' is not logical
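
The same message can be reproduced in base R without ff. This is only a minimal sketch, assuming (as the error text suggests) that the condition ends up in which() as a non-logical value; idx and msg are illustrative names, not part of the original code:

```r
# which() only accepts logical input; a numeric vector triggers the error.
idx <- c(1, 5)  # hypothetical non-logical "index" value

msg <- tryCatch(
  which(idx),
  error = function(e) conditionMessage(e)
)
print(msg)
```

So whatever is passed as the subset condition needs to evaluate to TRUE/FALSE values.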

Per You-Leee's suggestion, I tried passing a logical vector of the index of interest with this code:

n <- length(data[[1]]) #10.5 million
logical_index = c(1, 1e5) == seq.int(1, n)
data_subset <- subset(data, logical_index)

I ran it twice, and each time RStudio crashed with the message "R encountered a fatal error. The session was terminated." At first I thought it might be a memory constraint, but Activity Monitor shows that I still have 4 GB available out of 8 GB. Besides, this shouldn't be loading much into memory anyway.

The argument has to be a logical vector, so you have to put TRUE at the desired row indices and FALSE everywhere else:

> data <- ffdf(a = ff(1:12))
> subset.ffdf(data, c(1, 1e5) == seq.int(1, length(data$a)))
ffdf (all open) dim=c(1,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
  PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix     PhysicalIsMatrix
a            a      integer       integer FALSE           FALSE                FALSE
  PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
a                 1                1               1           TRUE
ffdf data
  a
1 1
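
Note that the result above has a single row: c(1, 1e5) == seq.int(1, n) recycles the two values along the sequence, so it does not mark the whole range from 1 to 1e5. A minimal base R sketch of the difference, using a small n and an upper bound of 5 as stand-ins for the real sizes (recycled and in_range are illustrative names):

```r
n <- 12L

# Recycling: odd positions are compared with 1, even positions with 5,
# so only position 1 matches.
recycled <- c(1, 5) == seq_len(n)
which(recycled)

# A logical vector that is TRUE for every position in the range 1..5.
idx <- seq_len(n)
in_range <- idx >= 1 & idx <= 5
which(in_range)
```

A vector like in_range is the kind of TRUE/FALSE index described above. For 10.5 million rows it is itself a full-length in-memory vector, so it may be worth checking whether it fits comfortably in RAM.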
