How to correctly implement LZ4, Snappy or equivalent compression techniques in Java?

I tried integrating the Java version of LZ4 into a search-engine-style program that searches for data in large text files. I simply compressed the output stream and stored it in .txt files or files without names. However, the supposedly compressed files did not shrink; they were even larger than the original files.

In the end I had to resort to zip4j, since it works for me.

How should I use the LZ4 or Snappy jars to compress and decompress correctly?

In addition, how can I use such algorithms to compress a single folder containing many files?

Thanks!

I faced a similar problem. I was trying to send a large file (~709 MB) over a local network in chunks of 8192 bytes, and I used LZ4 compression/decompression to reduce the network bandwidth.

So, assuming you are trying to do something similar, here's my suggestion:

Here's a snippet similar to the regular example you'll find at https://github.com/jpountz/lz4-java

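import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

private static int decompressedLength;
private static LZ4Factory factory = LZ4Factory.fastestInstance();
private static LZ4Compressor compressor = factory.fastCompressor();

public static byte[] compress(byte[] src, int srcLen) {
    decompressedLength = srcLen;
    // Worst-case output size for srcLen input bytes.
    int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
    byte[] compressed = new byte[maxCompressedLength];
    // The return value (the actual compressed length) is discarded here.
    compressor.compress(src, 0, decompressedLength, compressed, 0, maxCompressedLength);
    // Returns the whole worst-case buffer, including unused trailing bytes.
    return compressed;
}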

Now, if you return the compressed byte array as it is, its length is maxCompressedLength, the worst-case bound. That makes it longer than the original uncompressed data, no matter how well the data actually compressed.
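To see why the worst-case buffer is always bigger than the input, here is a minimal sketch of the bound itself. It assumes lz4-java's maxCompressedLength follows the LZ4_compressBound formula from the reference C implementation; the class name Lz4Bound is hypothetical and not part of the library.

```java
/** Sketch of the worst-case LZ4 block size (assumed to match LZ4_compressBound). */
final class Lz4Bound {
    // Largest input size the reference LZ4 implementation accepts.
    private static final int MAX_INPUT_SIZE = 0x7E000000;

    private Lz4Bound() {}

    // Assumption: same formula as the reference C macro LZ4_COMPRESSBOUND.
    static int maxCompressedLength(int srcLen) {
        if (srcLen < 0 || srcLen > MAX_INPUT_SIZE) {
            throw new IllegalArgumentException("srcLen out of range: " + srcLen);
        }
        // Input size plus a small overhead for data that does not compress.
        return srcLen + srcLen / 255 + 16;
    }
}
```

For an 8192-byte chunk this gives 8240 bytes, so a file written from the untrimmed buffer grows by at least 48 bytes per chunk.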

So you can modify it as follows:

import java.util.Arrays;
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

private static int decompressedLength;
private static LZ4Factory factory = LZ4Factory.fastestInstance();
private static LZ4Compressor compressor = factory.fastCompressor();

public static byte[] compress(byte[] src, int srcLen) {
    decompressedLength = srcLen;
    // Worst-case output size for srcLen input bytes.
    int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
    byte[] compressed = new byte[maxCompressedLength];
    // compress() returns the number of bytes actually written to the buffer.
    int compressLen = compressor.compress(src, 0, decompressedLength, compressed, 0, maxCompressedLength);
    // Keep only the bytes that hold compressed data.
    byte[] finalCompressedArray = Arrays.copyOf(compressed, compressLen);
    return finalCompressedArray;
}

compressLen stores the actual compressed length, and the finalCompressedArray byte array (of length compressLen) holds only the actual compressed data. Its length never exceeds that of the compressed byte array and, for compressible data, is less than that of the original uncompressed byte array.

Now you can decompress the finalCompressedArray byte array in the regular fashion, as below:

import net.jpountz.lz4.LZ4FastDecompressor;

private static LZ4FastDecompressor decompressor = factory.fastDecompressor();

public static byte[] decompress(byte[] finalCompressedArray, int decompressedLength) {
    // The fast decompressor needs the exact original length.
    // It allocates the destination array itself, so no extra buffer is needed.
    return decompressor.decompress(finalCompressedArray, decompressedLength);
}
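Because decompress needs the original length and the receiver also has to know where each compressed chunk ends, both values must travel with the data. One possible way is to prefix every chunk with a small header. The sketch below uses only the JDK; the ChunkFrame class and its header layout (two 4-byte integers followed by the payload) are hypothetical and not part of lz4-java.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical frame layout: [int originalLength][int compressedLength][payload]. */
final class ChunkFrame {
    final int originalLength; // length to pass to the decompressor
    final byte[] payload;     // exactly compressedLength bytes of compressed data

    private ChunkFrame(int originalLength, byte[] payload) {
        this.originalLength = originalLength;
        this.payload = payload;
    }

    // Writes only the first compressedLength bytes of the buffer, so the
    // worst-case buffer can be passed directly without copying it first.
    static void write(DataOutputStream out, int originalLength,
                      byte[] compressed, int compressedLength) throws IOException {
        out.writeInt(originalLength);
        out.writeInt(compressedLength);
        out.write(compressed, 0, compressedLength);
    }

    // Reads one frame; throws EOFException if the stream ends mid-frame.
    static ChunkFrame read(DataInputStream in) throws IOException {
        int originalLength = in.readInt();
        int compressedLength = in.readInt();
        if (originalLength < 0 || compressedLength < 0) {
            throw new IOException("corrupt frame header");
        }
        byte[] payload = new byte[compressedLength];
        in.readFully(payload);
        return new ChunkFrame(originalLength, payload);
    }
}
```

On the receiving side, the payload and originalLength fields of each frame would be the two arguments handed to a method like decompress above.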

A .jar file is a .zip file. The zip file format does not support LZ4 or Snappy.
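Since zip is not an option, one common approach for a folder is to bundle all files into a single stream first and then compress that stream as a whole. The sketch below is only an illustration of that idea: the FolderBundler class and its entry layout are hypothetical, and the JDK's GZIPOutputStream stands in for the compressor so the example runs without extra jars. With lz4-java, the library's LZ4BlockOutputStream could presumably be wrapped around the sink instead; that swap is not shown here.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical bundle layout: [int fileCount] then, per file, [UTF path][long size][bytes]. */
final class FolderBundler {
    private FolderBundler() {}

    // Writes every regular file under root into one compressed stream.
    // Closing the compressed stream also closes the sink.
    static void bundle(Path root, OutputStream sink) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        // Stand-in compressor (assumption): replace GZIPOutputStream with
        // another compressing OutputStream to change the algorithm.
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new GZIPOutputStream(sink)))) {
            out.writeInt(files.size());
            for (Path file : files) {
                // Store paths relative to root, with '/' as the separator.
                out.writeUTF(root.relativize(file).toString().replace('\\', '/'));
                out.writeLong(Files.size(file));
                Files.copy(file, out);
            }
        }
    }
}
```

Reading the bundle back means reversing the same steps: decompress the stream, read the file count, then read each path, size, and that many bytes.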
