
How can I apply ffdf to non-atomic data frames?

Many posts (such as this one) claim the ff package is superior to bigmemory because it can handle objects with both atomic and non-atomic components, but how? For example:

UNIT <- c(100,100, 200, 200, 200, 200, 200, 300, 300, 300,300)
STATUS <- c('ACTIVE','INACTIVE','ACTIVE','ACTIVE','INACTIVE','ACTIVE','INACTIVE','ACTIVE',
        'ACTIVE','ACTIVE','INACTIVE') 
TERMINATED <- as.Date(c('1999-07-06','2008-12-05','2000-08-18','2000-08-18','2000-08-18',
                    '2008-08-18','2008-08-18','2006-09-19','2006-09-19','2006-09-19',
                    '1999-03-15')) 
START <- as.Date(c('2007-04-23','2008-12-06','2004-06-01','2007-02-01','2008-04-19',
               '2010-11-29','2010-12-30','2007-10-29','2008-02-05','2008-06-30',
               '2009-02-07'))
STOP <- as.Date(c('2008-12-05','2012-12-31','2007-01-31','2008-04-18','2010-11-28',
              '2010-12-29','2012-12-31','2008-02-04','2008-06-29','2009-02-06',
              '2012-12-31'))
TEST <- data.frame(UNIT,STATUS,TERMINATED,START,STOP)
TEST                   

#install.packages('ff')            
library('ff')            
TEST2 <- ffdf(TEST)            
Error in ffdf(TEST) : ffdf components must be atomic ff objects

What can I do to make this work?

Using

TEST2 <- as.ffdf(TEST)   

instead of

TEST2 <- ffdf(TEST)   

will work.

Explanation: as.ffdf converts your data.frame to an ffdf. ffdf itself is the constructor, so if you really want to call it directly you need to supply atomic ff vectors, as the error message indicates. For the above example this would be:

ffdf(UNIT = as.ff(UNIT),
     STATUS = as.ff(as.factor(STATUS)),
     TERMINATED = as.ff(TERMINATED),
     START = as.ff(START),
     STOP = as.ff(STOP))

See ?as.ffdf or ?ffdf, part of the ff package.
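One caveat for the as.ffdf route: ff has no character storage mode, and since R 4.0.0 data.frame() no longer converts strings to factors by default. On a recent R the STATUS column would therefore be character, and as.ffdf would most likely reject it. A minimal sketch, assuming a shortened three-row version of the question's data, that makes the factor conversion explicit:

```r
library(ff)

# Shortened version of the question's data (three rows, three columns).
TEST <- data.frame(
  UNIT   = c(100, 100, 200),
  STATUS = c("ACTIVE", "INACTIVE", "ACTIVE"),
  START  = as.Date(c("2007-04-23", "2008-12-06", "2004-06-01")),
  stringsAsFactors = TRUE  # ff cannot store character vectors, so use factors
)

# as.ffdf converts each column to an ff vector and binds them into an ffdf.
TEST2 <- as.ffdf(TEST)

class(TEST2)
dim(TEST2)
```

On R versions before 4.0.0 the stringsAsFactors argument is redundant but harmless, since TRUE was the default there.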

In real life, your data would usually come from other sources such as CSV files or SQL databases rather than from a data.frame already in R. See the ETLUtils package for an easy way to get data from SQL into ff.

I tried coercing the columns of the TEST data.frame to ff objects before calling ffdf, but this didn't work. Here is a workaround using read.csv.ffdf:

# row.names = FALSE keeps write.csv from adding an extra row-name column
write.csv(TEST, file = 'test.csv', row.names = FALSE)
TEST.ffd <- read.csv.ffdf(file = 'test.csv')
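A side effect of the CSV round trip is that the dates are written as text, and read.csv.ffdf reads text columns back as factors by default. As a sketch (not tested against every ff version), passing colClasses should keep the date columns as Date. The three-row data and the temporary file path below are illustrative assumptions, not part of the original example:

```r
library(ff)

# Shortened version of the question's data.
TEST <- data.frame(
  UNIT   = c(100, 200, 300),
  STATUS = c("ACTIVE", "INACTIVE", "ACTIVE"),
  START  = as.Date(c("2007-04-23", "2004-06-01", "2007-10-29"))
)

# Hypothetical scratch file; any writable path would do.
csv_path <- tempfile(fileext = ".csv")
write.csv(TEST, file = csv_path, row.names = FALSE)

# Declare the column types up front: text becomes a factor, dates stay Date.
TEST.ffd <- read.csv.ffdf(
  file       = csv_path,
  colClasses = c("numeric", "factor", "Date")
)

dim(TEST.ffd)
names(TEST.ffd)
```

For files too large to read in one go, read.csv.ffdf also accepts first.rows and next.rows to control the chunk sizes used while importing.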
