How to process compressed data in Java

I have some data which takes up more than 50 MB in an uncompressed file, but compresses down to less than half a MB using gzip.

Most of this is numerical data. I'm trying to figure out how to process this data without having to uncompress it completely. For example, if this data contains a couple of strings and 5 or so numerical values per record, is there a way I can uncompress a single row (or a small set of rows), process them, then discard them?

Unix provides utilities such as zcat, zgrep, etc. which operate directly on compressed data. I'd like to do the same in Java.

Thanks

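Just wrap your FileInputStream in a GZIPInputStream. The stream decompresses data on demand as you read from it, so the whole file is never held in memory at once: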

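import java.io.*;
import java.util.zip.GZIPInputStream;

public static BufferedReader createReader (File f, String encoding) throws IOException
{
    InputStream in = new FileInputStream (f);
    try
    {
        // Decompress on the fly for .gz files; 10240 is the inflater input buffer size in bytes
        if (f.getName ().endsWith (".gz"))
            in = new GZIPInputStream (in, 10240);

        return new BufferedReader (new InputStreamReader (in, encoding));
    }
    catch (UnsupportedEncodingException e)
    {
        in.close ();  // do not leak the file handle when the charset name is unknown
        throw new RuntimeException("Missing encoding "+encoding, e);
    }
    catch (IOException e)
    {
        in.close ();  // e.g. the file is not valid gzip data
        throw e;
    }
}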
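
To show how this fits the "uncompress a row, process it, discard it" use case, here is a minimal sketch of reading records one line at a time. It assumes comma-separated records with no quoted commas and UTF-8 text; the class name GzipRowDemo, the sumColumn and writeSample helpers, and the sample rows are all made up for illustration. Note that plain gzip is a sequential format, so this reads rows in order from the start rather than jumping to an arbitrary row.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRowDemo {

    // Streams the gzip file one line at a time and sums a single numeric column.
    // Assumption: comma-separated records, no quoted commas (hypothetical layout).
    public static double sumColumn(Path gzFile, int column) throws IOException {
        double sum = 0.0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split(",");
                sum += Double.parseDouble(fields[column].trim());
                // 'line' and 'fields' become garbage after each iteration,
                // so only one record is held in memory at a time.
            }
        }
        return sum;
    }

    // Writes a tiny gzip-compressed sample file so the sketch runs end to end.
    public static Path writeSample() throws IOException {
        Path tmp = Files.createTempFile("records", ".gz");
        try (Writer w = new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(tmp)), StandardCharsets.UTF_8)) {
            w.write("alice,north,1.5,2\n");
            w.write("bob,south,2.5,4\n");
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        Path sample = writeSample();
        System.out.println(sumColumn(sample, 2));
        Files.delete(sample);
    }
}
```

The try-with-resources block closes the reader, and with it the underlying gzip and file streams, even if a row fails to parse.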
