I have some data which takes up more than 50 MB in an uncompressed file, but compresses down to less than half a MB using gzip.
Most of this is numerical data. I'm trying to figure out how to process this data without having to uncompress it completely. For example, if this data contains a couple of strings and 5 or so numerical values per record, is there a way I can uncompress a single row (or a small set of rows), process them, then discard them?
Unix provides utilities such as zcat, grep, etc. which operate directly on compressed data; I'd like to do the same in Java.
Thanks
Just wrap your FileInputStream in a GZIPInputStream:
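import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.util.zip.GZIPInputStream;

// Opens a plain or gzip-compressed text file as a BufferedReader.
public static BufferedReader createReader (File f, String encoding) throws IOException
{
    try
    {
        InputStream in = new FileInputStream (f);
        // Decompress on the fly when the name ends in .gz (10240-byte inflater buffer).
        if (f.getName ().endsWith (".gz"))
            in = new GZIPInputStream (in, 10240);
        return new BufferedReader (new InputStreamReader (in, encoding));
    }
    catch (UnsupportedEncodingException e)
    {
        throw new RuntimeException ("Missing encoding " + encoding, e);
    }
}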
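
To tie this back to the "process a few rows, then discard them" part of the question, here is a minimal sketch of reading such a stream one record at a time. A gzip stream can only be decompressed sequentially, so this does not jump to an arbitrary row, but only the current line and the decompression buffers are held in memory at any moment. The record layout (comma-separated fields, one record per line), the class name GzipRecordSum and the method sumColumn are assumptions made for illustration; they are not from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRecordSum {

    // Streams a gzipped text file line by line and sums one numeric column.
    // Assumes comma-separated records (hypothetical layout); each line becomes
    // garbage as soon as the next one is read.
    public static double sumColumn(Path gzFile, int column) throws IOException {
        double total = 0.0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split(",");
                total += Double.parseDouble(fields[column]);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Write a tiny gzipped sample file so the sketch is runnable on its own.
        Path tmp = Files.createTempFile("records", ".csv.gz");
        try (Writer w = new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(tmp)),
                StandardCharsets.UTF_8)) {
            w.write("alpha,x,1.5,2\n");
            w.write("beta,y,2.5,4\n");
        }
        // Sum the third field (index 2) across all records.
        System.out.println(sumColumn(tmp, 2));
        Files.delete(tmp);
    }
}
```

The try-with-resources block closes the reader, and with it the underlying gzip and file streams, even if parsing a record throws.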