
How to subset ffdf by index?

I would like to subset an ffdf object by index, returning another ffdf object.

The help file on subset.ffdf indicates that you can pass a range index (ri) object as an argument, but when I tried:

data_subset <- subset.ffdf(data, ri(1, 1e5))

I got this error:

Error in which(eval(e, nl, envir)) : argument to 'which' is not logical

Per You-Leee's suggestion, I tried passing a logical vector of the index of interest with this code:

n <- length(data[[1]]) #10.5 million
logical_index = c(1, 1e5) == seq.int(1, n)
data_subset <- subset(data, logical_index)

I ran it twice, and each time RStudio crashed with the message "R encountered a fatal error. The session was terminated." At first I thought it might be a memory constraint, but according to my activity monitor I still have 4 GB available out of 8 GB. Besides, this shouldn't be loading much into memory anyway.

The argument has to be logical, so you have to put TRUE at the desired indices and FALSE everywhere else:

> data <- ffdf(a = ff(1:12))
> subset.ffdf(data, c(1, 1e5) == seq.int(1, length(data$a)))
ffdf (all open) dim=c(1,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
  PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix     PhysicalIsMatrix
a            a      integer       integer FALSE           FALSE                FALSE
  PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
a                 1                1               1           TRUE
ffdf data
  a
1 1
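
Note that the comparison used above relies on R's vector recycling: c(1, 1e5) is repeated along the row positions, so in the 12-row example only row 1 comes out TRUE, which matches the one-row result shown. If the goal is every row from 1 up to some last row, a range test is one way to build the logical vector. The following is a minimal base-R sketch, not tested against ff itself; the names n, first, last, pos and in_range are illustrative assumptions, and the small sizes are chosen only to make the result easy to inspect.

```r
# Illustrative sizes only; the question's data has about 10.5 million rows.
n <- 12L      # number of rows, as in the 12-row example above
first <- 1L   # hypothetical first row to keep
last <- 5L    # hypothetical last row to keep

# The answer's expression: c(1, 1e5) is recycled to 1, 1e5, 1, 1e5, ...
# and compared position by position, so only position 1 matches.
recycled <- c(1, 1e5) == seq_len(n)
which(recycled)

# Range test: TRUE for every position between first and last inclusive.
pos <- seq_len(n)
in_range <- pos >= first & pos <= last
which(in_range)
```

A vector like in_range could then be passed to subset in place of the recycled comparison. Whether that avoids the crash reported in the question at 10.5 million rows is not something this sketch establishes.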
