
R with the ff and FSelector packages

I have a 1360x92735 CSV dataset and I need to reduce its dimensionality using the FSelector package for R (information.gain()), but it requires a lot of RAM.

My question is: can I use the ff package in combination with FSelector? If so, how?

P.S. I have 8 GB of RAM and 8 GB of swap on Linux.

Thanks.

[EDIT]

I've tried using the ff and FSelector packages with the iris dataset. It seems to work well, but now I have a problem with ff.
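For reference, a minimal sketch of what such an iris test might look like. It assumes that indexing an ffdf with `[, j]` returns an ordinary in-RAM data.frame, and that information.gain() scores each attribute independently of the others, so columns can be scored one block at a time. The names `class_col` and `block_size` are hypothetical and exist only for illustration.

```r
library(ff)         # on-disk data frames (ffdf)
library(FSelector)  # information.gain(), cutoff.k()

# Store iris as an ffdf; its columns are backed by files on disk.
iris_ff <- as.ffdf(iris)

# Hypothetical settings for illustration.
class_col  <- "Species"
block_size <- 2

feature_idx <- which(names(iris_ff) != class_col)
class_idx   <- which(names(iris_ff) == class_col)

# Split the feature columns into blocks of at most block_size columns.
blocks <- unname(split(feature_idx,
                       ceiling(seq_along(feature_idx) / block_size)))

# Score one block at a time: pull only that block plus the class column
# into RAM as a data.frame, then call information.gain() on it.
weights <- do.call(rbind, lapply(blocks, function(idx) {
  block_df <- iris_ff[, c(idx, class_idx)]
  information.gain(as.formula(paste(class_col, "~ .")), data = block_df)
}))

# Keep the two highest-scoring attributes.
top <- cutoff.k(weights, 2)
print(top)
```

With a wide dataset, only `block_size` feature columns plus the class column would be held in RAM at any one time, which is the point of combining the two packages this way.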

My CSV dataset is 1303x92735, and when I try to convert a data frame to an ff object with as.ffdf(), or to load the dataset directly with read.csv.ffdf(), R crashes with a "write error".

Someone here has the same problem, but I can't tell whether a solution was reached.

Thanks.

The error is likely because ff opens a file for each column in the ff data frame. You have 92,735 columns, which is likely far more than the maximum number of open files your system is configured to allow. I've answered this on SO here.
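One way to check whether this explanation applies is to compare the column count against the per-process open-file limit. The sketch below assumes a Linux or other Unix-like system where R's system() runs commands through a shell that provides the `ulimit` builtin; `check_open_file_limit` is a hypothetical helper name, not part of ff.

```r
# Hypothetical helper: compare a column count against the soft limit on
# open files reported by the shell ("ulimit -n").
check_open_file_limit <- function(n_columns) {
  raw <- system("ulimit -n", intern = TRUE)
  # The shell prints either a number or the word "unlimited".
  limit <- if (identical(raw, "unlimited")) Inf else as.numeric(raw)
  list(limit = limit, n_columns = n_columns, fits = n_columns < limit)
}

res <- check_open_file_limit(92735)
print(res)
```

If `fits` is FALSE, the diagnosis above is at least plausible. Possible ways forward, untested here, would be raising the limit with `ulimit -n` in the shell before starting R, or loading the columns in smaller blocks so that fewer files are open at once.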
