Compression on Java NIO direct buffers

Question

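The gzip input/output streams don't operate on Java direct buffers.

Is there any compression algorithm implementation that operates directly on direct buffers?

That way there would be no overhead from copying a direct buffer into a Java byte array for compression.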

Answer 1

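I don't mean to detract from your question, but is this really a good optimization point in your program? Have you verified with a profiler that you actually have a problem? Your question as stated implies that you have not done any research and are merely guessing that allocating a byte[] will cause a performance or memory problem. Since all the answers in this thread are likely to be hacks of some sort, you should really confirm that you have a problem before fixing it.

Back to the question: if you want to compress the data "in place" in a ByteBuffer, the answer is no, there is no capability for that built into Java.

If you allocated your buffer like the following: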

byte[] bytes = getMyData();
ByteBuffer buf = ByteBuffer.wrap(bytes);

then you can filter your byte[] through a ByteBufferInputStream, as the previous answer suggested.
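To make the distinction concrete, here is a minimal sketch (the class name HeapBufferDeflate is hypothetical, not from the answer). It assumes the buffer came from ByteBuffer.wrap or ByteBuffer.allocate, so hasArray() is true and the backing array can be handed to Deflater without a copy; a direct buffer has no accessible array and is rejected.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

final class HeapBufferDeflate {

    /** Deflates the remaining bytes of an array-backed buffer (zlib format). */
    static byte[] deflate(ByteBuffer src) {
        if (!src.hasArray()) {
            // Direct and read-only buffers do not expose a backing array.
            throw new IllegalArgumentException("buffer is not array-backed");
        }
        Deflater def = new Deflater();
        try {
            // Pass the backing array itself to the Deflater: no intermediate copy of the input.
            def.setInput(src.array(), src.arrayOffset() + src.position(), src.remaining());
            def.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!def.finished()) {
                int n = def.deflate(chunk);
                out.write(chunk, 0, n);
            }
            // Mark the input as consumed, as a stream-style reader would.
            src.position(src.limit());
            return out.toByteArray();
        } finally {
            def.end();
        }
    }
}
```

This avoids the copy only for heap buffers; for direct buffers some other route is needed.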

Answer 2

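Wow, old question, but I stumbled upon this today.

Some libraries such as zip4j can probably handle this, but since Java 11 you can get the job done without external dependencies:

If you are only interested in compressing the data, you can simply do: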

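void compress(ByteBuffer src, ByteBuffer dst) {
    // nowrap = true: emit a raw deflate stream without the zlib header and checksum
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        def.deflate(dst, Deflater.SYNC_FLUSH);

        // finished() stays false when dst filled up before the whole stream was written
        if (!def.finished()) {
            throw new RuntimeException("dst too small");
        }
    } finally {
        def.end();
    }
}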

Both src and dst will have their positions changed, so you may need to flip them after compress returns.

To recover the compressed data:

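void decompress(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // nowrap = true: expect a raw deflate stream, matching compress
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        inf.inflate(dst);

        // finished() stays false when dst filled up before the end of the stream
        if (!inf.finished()) {
            throw new RuntimeException("dst too small");
        }

    } finally {
        inf.end();
    }
}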
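As a usage sketch of the two single-pass methods, the round trip below shows where the flips go. It is an illustration, not the answer's code: the class name DirectRoundTrip and the "+ 64" output sizing are assumptions that only suit short inputs, and it detects a too-small destination through finished().

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class DirectRoundTrip {

    static void compress(ByteBuffer src, ByteBuffer dst) {
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            def.setInput(src);
            def.finish();
            def.deflate(dst);
            if (!def.finished()) {
                throw new IllegalStateException("dst too small");
            }
        } finally {
            def.end();
        }
    }

    static void decompress(ByteBuffer src, ByteBuffer dst) {
        Inflater inf = new Inflater(true);
        try {
            inf.setInput(src);
            inf.inflate(dst);
            if (!inf.finished()) {
                throw new IllegalStateException("dst too small or input truncated");
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
    }

    /** Compresses and restores text using direct buffers only. */
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer plain = ByteBuffer.allocateDirect(bytes.length);
        plain.put(bytes).flip();                        // switch from writing to reading

        // Assumed sizing: enough headroom for short inputs only.
        ByteBuffer packed = ByteBuffer.allocateDirect(bytes.length + 64);
        compress(plain, packed);
        packed.flip();                                  // compressed bytes are now readable

        ByteBuffer restored = ByteBuffer.allocateDirect(bytes.length + 16);
        decompress(packed, restored);
        restored.flip();                                // restored bytes are now readable
        return StandardCharsets.UTF_8.decode(restored).toString();
    }
}
```

Calling DirectRoundTrip.roundTrip("some text") should return the same string.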

Note that both methods expect the (de)compression to happen in a single pass; however, slightly modified versions can stream it:

void compress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) {
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        int cmp;
        do {
            cmp = def.deflate(dst, Deflater.SYNC_FLUSH);
            if (cmp > 0) {
                sink.accept(dst.flip());
                dst.clear();
            }
        } while (cmp > 0);
    } finally {
        def.end();
    }
}

void decompress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) throws DataFormatException {
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        int dec;
        do {
            dec = inf.inflate(dst);

            if (dec > 0) {
                sink.accept(dst.flip());
                dst.clear();
            }

        } while (dec > 0);
    } finally {
        inf.end();
    }
}

Example:

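void compressLargeFile() throws IOException {
    var temp = ByteBuffer.allocateDirect(1024 * 1024);

    // Open the output for writing; the default open mode is read-only
    try (var in = FileChannel.open(Paths.get("large"));
         var out = FileChannel.open(Paths.get("large.zip"), StandardOpenOption.CREATE,
                 StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
        long start = 0;
        long rem = in.size();
        while (rem > 0) {
            long mapped = Math.min(16 * 1024 * 1024, rem);
            var src = in.map(FileChannel.MapMode.READ_ONLY, start, mapped);

            // Each call uses its own Deflater, so every chunk becomes a separate raw deflate stream
            compress(src, temp, (bb) -> {
                try {
                    while (bb.hasRemaining()) {
                        out.write(bb);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });

            start += mapped;
            rem -= mapped;
        }
    }
}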
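The example above starts a new Deflater for every mapped chunk. If a single continuous deflate stream is wanted instead, one possible variant (a sketch, not part of the answer; ChunkedDeflate, compressAll and drain are hypothetical names) keeps one Deflater alive across all chunks and calls finish() only once at the end:

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.function.Consumer;
import java.util.zip.Deflater;

final class ChunkedDeflate {

    /** Feeds every chunk through one Deflater so the sink receives a single raw deflate stream. */
    static void compressAll(List<ByteBuffer> chunks, ByteBuffer scratch, Consumer<ByteBuffer> sink) {
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            for (ByteBuffer chunk : chunks) {
                def.setInput(chunk);
                // Keep deflating until this chunk has been fully consumed.
                while (!def.needsInput()) {
                    drain(def, scratch, sink);
                }
            }
            // Only now mark the end of input and flush what is still buffered.
            def.finish();
            while (!def.finished()) {
                drain(def, scratch, sink);
            }
        } finally {
            def.end();
        }
    }

    /** Runs one deflate call into scratch and hands any produced bytes to the sink. */
    private static void drain(Deflater def, ByteBuffer scratch, Consumer<ByteBuffer> sink) {
        scratch.clear();
        def.deflate(scratch);
        scratch.flip();
        if (scratch.hasRemaining()) {
            sink.accept(scratch);
        }
    }
}
```

The chunks could be mapped file regions as in the example; the sink must consume the scratch buffer before returning because it is reused on the next call.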

If you want fully gzip-compatible data (the methods below are named zip, but the framing they write is gzip):

void zip(ByteBuffer src, ByteBuffer dst) {
    var u = src.remaining();
    var crc = new CRC32();
    crc.update(src.duplicate());
    writeHeader(dst);

    compress(src, dst);

    writeTrailer(crc, u, dst);
}

where:

void writeHeader(ByteBuffer dst) {
    var header = new byte[] { (byte) 0x8b1f, (byte) (0x8b1f >> 8), Deflater.DEFLATED, 0, 0, 0, 0, 0, 0, 0 };
    dst.put(header);
}
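The casts in writeHeader rely on narrowing: (byte) 0x8b1f keeps only the low byte 0x1f, and the shift by 8 yields 0x8b. As an illustration (GzipMagic is a hypothetical name), the same ten bytes can be spelled out field by field, following the gzip member header layout in RFC 1952:

```java
final class GzipMagic {

    /** Builds the same 10-byte header as writeHeader, one field per line. */
    static byte[] header() {
        return new byte[] {
            (byte) 0x1f,   // ID1: low byte of 0x8b1f
            (byte) 0x8b,   // ID2: high byte of 0x8b1f
            8,             // CM: deflate, the same value as Deflater.DEFLATED
            0,             // FLG: no optional fields follow
            0, 0, 0, 0,    // MTIME: no timestamp
            0,             // XFL: no extra flags
            0              // OS: 0 means FAT; 255 would mean "unknown"
        };
    }
}
```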

and:

void writeTrailer(CRC32 crc, int uncompressed, ByteBuffer dst) {
    // The gzip trailer is the CRC-32 followed by the uncompressed size, both little-endian
    if (dst.order() == ByteOrder.LITTLE_ENDIAN) {
        dst.putInt((int) crc.getValue());
        dst.putInt(uncompressed);
    } else {
        dst.putInt(Integer.reverseBytes((int) crc.getValue()));
        dst.putInt(Integer.reverseBytes(uncompressed));
    }
}

So zip adds 10 + 8 bytes of overhead: a 10-byte header and an 8-byte trailer.

To unzip a direct buffer into another one, you can wrap the src buffer in an InputStream:

class ByteBufferInputStream extends InputStream {

    final ByteBuffer bb;

    public ByteBufferInputStream(ByteBuffer bb) {
        this.bb = bb;
    }

    @Override
    public int available() throws IOException {
        return bb.remaining();
    }

    @Override
    public int read() throws IOException {
        return bb.hasRemaining() ? bb.get() & 0xFF : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        var rem = bb.remaining();

        if (rem == 0) {
            return -1;
        }

        len = Math.min(rem, len);

        bb.get(b, off, len);

        return len;
    }

    @Override
    public long skip(long n) throws IOException {
        var rem = bb.remaining();

        if (n > rem) {
            bb.position(bb.limit());
            n = rem;
        } else {
            bb.position((int) (bb.position() + n));
        }

        return n;
    }
}

and use:

void unzip(ByteBuffer src, ByteBuffer dst) throws IOException {
    try (var is = new ByteBufferInputStream(src); var gis = new GZIPInputStream(is)) {
        var tmp = new byte[1024];

        var r = gis.read(tmp);

        if (r > 0) {
            do {
                dst.put(tmp, 0, r);
                r = gis.read(tmp);
            } while (r > 0);
        }

    }
}

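Of course, this is not cool, since we copy the data into a temporary array. Nevertheless, it serves as a round-trip check proving that the NIO-based zip encoding writes valid data that standard IO-based consumers can read.

So, if we just ignore the CRC consistency check, we can drop the header and trailer: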

void unzipNoCheck(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    src.position(src.position() + 10).limit(src.limit() - 8);

    decompress(src, dst);
}
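If the consistency check should be kept without going through GZIPInputStream, one option is to verify the trailer by hand on the buffers. The sketch below is an assumption-laden illustration (CheckedUnzip and unzipChecked are hypothetical names): it only handles a plain 10-byte header with no optional fields, as written by writeHeader, and a single gzip member.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

final class CheckedUnzip {

    /** Inflates one gzip member (plain 10-byte header) from src into dst and verifies CRC-32 and size. */
    static void unzipChecked(ByteBuffer src, ByteBuffer dst) {
        // Raw deflate data sits between the 10-byte header and the 8-byte trailer.
        ByteBuffer body = src.duplicate();
        body.position(src.position() + 10).limit(src.limit() - 8);

        // Trailer: CRC-32, then uncompressed size, both little-endian.
        ByteBuffer trailer = src.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        trailer.position(src.limit() - 8);
        int expectedCrc = trailer.getInt();
        int expectedSize = trailer.getInt();

        int start = dst.position();
        Inflater inf = new Inflater(true);
        try {
            inf.setInput(body);
            inf.inflate(dst);
            if (!inf.finished()) {
                throw new IllegalStateException("dst too small or data truncated");
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }

        // Checksum exactly the bytes that were just written, without copying them.
        ByteBuffer written = dst.duplicate();
        written.limit(dst.position()).position(start);
        CRC32 crc = new CRC32();
        crc.update(written);
        if ((int) crc.getValue() != expectedCrc || dst.position() - start != expectedSize) {
            throw new IllegalStateException("gzip trailer mismatch");
        }
        src.position(src.limit());   // the whole member has been consumed
    }
}
```

CRC32.update(ByteBuffer) works on direct buffers as well, so no temporary byte[] is involved.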

Answer 3

If you are using ByteBuffers, you can use some simple Input/OutputStream wrappers such as these:

public class ByteBufferInputStream extends InputStream {

    private ByteBuffer buffer = null;

    public ByteBufferInputStream( ByteBuffer b) {
        this.buffer = b;
    }

    @Override
    public int read() throws IOException {
        // Signal end of stream with -1 instead of letting get() throw BufferUnderflowException
        return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
    }
}

public class ByteBufferOutputStream extends OutputStream {

    private ByteBuffer buffer = null;

    public ByteBufferOutputStream( ByteBuffer b) {
        this.buffer = b;
    }

    @Override
    public void write(int b) throws IOException {
        buffer.put( (byte)(b & 0xFF) );
    }

}

Test:

ByteBuffer buffer = ByteBuffer.allocate( 1000 );
ByteBufferOutputStream bufferOutput = new ByteBufferOutputStream( buffer );
GZIPOutputStream output = new GZIPOutputStream( bufferOutput );
output.write("stackexchange".getBytes());
output.close();

// Rewind so the compressed bytes can be read back from the start
buffer.position( 0 );

byte[] result = new byte[ 1000 ];

ByteBufferInputStream bufferInput = new ByteBufferInputStream( buffer );
GZIPInputStream input = new GZIPInputStream( bufferInput );
// read() returns how many bytes were decompressed; print only those
int n = input.read( result );

System.out.println( new String(result, 0, n));
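The wrappers above move one byte per call. A possible refinement, sketched here under hypothetical names (BulkBufferStreams, In, Out), overrides the array-based read and write so the gzip streams transfer whole blocks to and from the buffer. The 1000-byte capacity is taken from the test above; Out throws BufferOverflowException if the buffer is too small.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class BulkBufferStreams {

    static final class In extends InputStream {
        private final ByteBuffer buffer;

        In(ByteBuffer buffer) { this.buffer = buffer; }

        @Override
        public int read() {
            return buffer.hasRemaining() ? buffer.get() & 0xFF : -1;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            if (!buffer.hasRemaining()) return -1;
            int n = Math.min(len, buffer.remaining());
            buffer.get(b, off, n);          // one bulk copy instead of n single-byte reads
            return n;
        }
    }

    static final class Out extends OutputStream {
        private final ByteBuffer buffer;

        Out(ByteBuffer buffer) { this.buffer = buffer; }

        @Override
        public void write(int b) { buffer.put((byte) b); }

        @Override
        public void write(byte[] b, int off, int len) { buffer.put(b, off, len); }
    }

    /** Gzips text into a direct buffer and reads it back through the wrappers. */
    static String roundTrip(String text) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1000);
        try (GZIPOutputStream out = new GZIPOutputStream(new Out(buffer))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buffer.flip();                      // limit now marks the end of the compressed data
        try (GZIPInputStream in = new GZIPInputStream(new In(buffer))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The data still passes through the streams' internal byte arrays, so this reduces call overhead rather than removing the copy.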

Answer 1
2 accepted 2012-01-12 17:00:25

Answer 2
1 2021-04-16 18:41:49

Answer 3
0 2012-01-12 16:53:46
