Importing compressed CSV into 'h2o' using R

'h2o' is a fun Java-based ML tool that is accessible from R. The R package for accessing it is also called "h2o".

One way to get data in is to tell 'h2o' where a CSV file is and let it upload the raw CSV. It can be more effective to point 'h2o' at a folder and tell it to import "everything in it" using the h2o.importFolder function.

Is there a way to point 'h2o' at a folder of gzip- or bzip-compressed CSV files and get it to import them?

According to this link (here), 'h2o' can import compressed files. I just don't see a way to specify this for the importFolder approach.

Is it faster or slower to import the compressed form? If another program generates the files, does the 'h2o' import run faster when they are compressed or when they are raw text? Guidelines and performance best practices are appreciated.

As always, comments, suggestions, and feedback are welcome.

I took the advice of @screechOwl and asked on the h2o board at 0xdata.atlassian.net, where user "cliff" gave a clear answer:

Hi, yes H2O - when importing a folder - takes all the files in the folder; it unzips gzip'd or zip'd files as needed, and parses them all into one large CSV. All the files have to be compatible in the CSV sense - same number and kind of columns.
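To illustrate the point about compatible files (this sketch is not part of cliff's reply), here is a minimal base-R check that every gzip'd CSV in a folder starts with the same header line, followed by a call to h2o.importFolder. The helper names `gz_headers_match` and `import_gz_folder` and the `.csv.gz` naming convention are assumptions made for the example. The check compares header text only, so it will not catch columns whose names match but whose types differ.

```r
# Hypothetical helper: TRUE when all .csv.gz files in `dir` share one header line.
gz_headers_match <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv\\.gz$", full.names = TRUE)
  if (length(files) == 0) stop("no .csv.gz files found in ", dir)
  read_header <- function(f) {
    con <- gzfile(f, open = "rt")   # gzfile() decompresses while reading
    on.exit(close(con))
    readLines(con, n = 1)
  }
  headers <- vapply(files, read_header, character(1))
  length(unique(headers)) == 1
}

# Hypothetical wrapper: check the folder, then let H2O import and parse it.
# Assumes the 'h2o' package is installed and the folder is readable by the
# H2O server that h2o.init() connects to.
import_gz_folder <- function(dir) {
  stopifnot(gz_headers_match(dir))
  h2o::h2o.init()
  h2o::h2o.importFolder(path = normalizePath(dir), pattern = ".*\\.csv\\.gz$")
}
```

A call such as `frame <- import_gz_folder("path/to/folder")` would then return a single H2O frame built from all matching files. The `pattern` argument is optional; it is used here only to skip anything in the folder that is not a gzip'd CSV.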

H2O does not currently handle bzip files.
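Given that limitation, one possible workaround (an assumption on my part, not something suggested in the reply) is to re-compress bzip2 files as gzip before pointing 'h2o' at the folder. The sketch below uses only base-R connections; `bz2_to_gz` is a hypothetical helper name.

```r
# Hypothetical helper: stream one .bz2 file into a .gz file next to it.
bz2_to_gz <- function(bz_path, gz_path = sub("\\.bz2$", ".gz", bz_path)) {
  stopifnot(gz_path != bz_path)       # refuse to overwrite the input file
  src <- bzfile(bz_path, open = "rb") # reads decompressed bytes
  on.exit(close(src), add = TRUE)
  dst <- gzfile(gz_path, open = "wb") # writes gzip-compressed bytes
  on.exit(close(dst), add = TRUE)
  repeat {
    chunk <- readBin(src, what = "raw", n = 1e6)  # copy about 1 MB at a time
    if (length(chunk) == 0) break
    writeBin(chunk, dst)
  }
  invisible(gz_path)
}
```

To convert a whole folder, something like `lapply(list.files(dir, "\\.csv\\.bz2$", full.names = TRUE), bz2_to_gz)` should work. The original .bz2 files are left in place, so move or delete them before importing the folder.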
