Reading using DeflateStream doesn't match expected size
I'm writing a set of discrete binary data to a stream, which is then written to disk. I'm using a buffered file stream to reduce disk I/O.
BinaryWriter -> DeflateStream -> FileStream (buffered)
Each item consists of a header (with some information) followed by the compressed raw data:
1. Signature, 1 byte.
2. Timestamp, 8 bytes.
3. Size of data uncompressed, 8 bytes
4. Data (compressed, using DeflateStream), X bytes
The problem is that when I read the data back by doing the reverse operation, the position of the stream doesn't match the expected value.
1. Read signature, 1 byte.
2. Read timestamp, 8 bytes (long).
3. Read data size, 8 bytes (long).
4. Read compressed data, using DeflateStream, (above value) bytes.
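This of course breaks the reading of every following item. For an item with 240_000 bytes of uncompressed data, decompressing it advances the file stream further than the compressed block is long. Reading the size itself works, since I write the data size before the raw data.
The issue is with DeflateStream, or with how I'm using it.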
// Writing
var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, 200 * 1_048_576);
var binaryWriter = new BinaryWriter(fileStream);

var item = new Item
{
    Signature = (byte)1,
    TimeStamp = DateTime.Now.Ticks,
    Data = new byte[] { .... }
};

binaryWriter.Write(item.Signature); //1 byte.
binaryWriter.Write(item.TimeStamp); //8 bytes.
binaryWriter.Write(item.Data.LongLength); //8 bytes.

//Reported position: 17 (1 + 8 + 8)
//Data Length: 240_000

//leaveOpen = true, so disposing the DeflateStream does not close the file.
using (var compressStream = new DeflateStream(fileStream, CompressionLevel.Optimal, true))
{
    compressStream.Write(item.Data);
    compressStream.Flush();
}

//Reported position: 8099 (1 + 8 + 8 + [compressed length])

//🔁 Repeats until all items are in cache.
// Reading
await using var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
using var binaryReader = new BinaryReader(fileStream);

var items = new List<Item>();

while (fileStream.Position < fileStream.Length)
{
    var item = new Item
    {
        StreamPosition = fileStream.Position
    };

    fileStream.Position += 1; //Skip signature.
    item.TimeStampInTicks = binaryReader.ReadInt64(); //🆗
    item.DataLength = binaryReader.ReadInt64(); //🆗, 240_000

    //Reported position: 17 (1 + 8 + 8) //🆗

    await using (var compressStream = new DeflateStream(fileStream, CompressionMode.Decompress, true))
    using (var compressBinaryReader = new BinaryReader(compressStream))
    {
        compressBinaryReader.ReadBytes((int)item.DataLength);
        //compressStream.ReadBytes((int)item.DataLength);
        //Same results without reader.
    }

    //Reported position: 8306 //📛
    //Expected position: 8099

    items.Add(item);
}
As a workaround, I also had to store the compressed size and then reposition the stream manually, because DeflateStream overshoots.
// Writing (with compressed length)
var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, 200 * 1_048_576);
var binaryWriter = new BinaryWriter(fileStream);

var item = new Item
{
    Signature = (byte)1,
    TimeStamp = DateTime.Now.Ticks,
    Data = new byte[] { .... }
};

binaryWriter.Write(item.Signature); //1 byte.
binaryWriter.Write(item.TimeStamp); //8 bytes.
binaryWriter.Write(item.Data.LongLength); //8 bytes, uncompressed length.

var start = fileStream.Position;
binaryWriter.Write(0L); //8 bytes, placeholder for the compressed length.

using (var compressStream = new DeflateStream(fileStream, UserSettings.All.CaptureCompression, true))
{
    compressStream.Write(item.Data);
    compressStream.Flush();
}

var end = fileStream.Position;
var compressedLength = end - start - 8; //8 as the position was obtained before the size was written.

//Go back, overwrite the placeholder, then return to the end of the item.
fileStream.Position = start;
binaryWriter.Write(compressedLength); //8 bytes, compressed length.
fileStream.Position = end;
// Reading (with compressed length)
await using var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
using var binaryReader = new BinaryReader(fileStream);

var items = new List<Item>();

while (fileStream.Position < fileStream.Length)
{
    var item = new Item
    {
        StreamPosition = fileStream.Position
    };

    fileStream.Position += 1; //Skip signature.
    item.TimeStampInTicks = binaryReader.ReadInt64();
    item.DataLength = binaryReader.ReadInt64();
    var compressedLength = binaryReader.ReadInt64();

    var currentPosition = fileStream.Position;

    await using (var compressStream = new DeflateStream(fileStream, CompressionMode.Decompress, true))
    {
        //ReadBytes is not a member of DeflateStream; presumably a custom extension method in my project.
        compressStream.ReadBytes((int)item.DataLength);
    }

    //Undo the overshoot: jump to the exact end of the compressed block.
    fileStream.Position = currentPosition + compressedLength;

    items.Add(item);
}
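The problem here is that DeflateStream has its own internal buffer. When you read from a DeflateStream, it reads data from the underlying stream (your fileStream in this case) into that buffer. It doesn't know in advance how many bytes it should read (that is, where exactly the compressed data ends), so it always tries to read as many bytes as the buffer can hold. This means that if the underlying stream contains other bytes after the compressed data, it's perfectly fine for DeflateStream to read past the end of the compressed data while filling its internal buffer on the last read. It won't use those extra bytes, but it does read them, which moves the position of your FileStream beyond the end of the compressed data.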
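To see the effect in isolation, here is a minimal sketch that uses a MemoryStream in place of the file. The class name OverreadDemo, the payload size and the number of trailing bytes are all made up for illustration. How far the position overshoots depends on the size of DeflateStream's internal buffer, which is an implementation detail, so the sketch only reports the two numbers rather than promising a specific value.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class OverreadDemo
{
    // Returns the length of the compressed block and the position of the
    // underlying stream after the block has been fully decompressed.
    public static (long CompressedLength, long PositionAfterRead) Run()
    {
        var payload = new byte[1000]; // hypothetical item data
        var stream = new MemoryStream();

        using (var deflate = new DeflateStream(stream, CompressionLevel.Optimal, leaveOpen: true))
        {
            deflate.Write(payload, 0, payload.Length);
        }

        var compressedLength = stream.Position;

        // Stand-in for the next item that follows in the same file.
        stream.Write(new byte[500], 0, 500);
        stream.Position = 0;

        using (var inflate = new DeflateStream(stream, CompressionMode.Decompress, leaveOpen: true))
        using (var reader = new BinaryReader(inflate))
        {
            reader.ReadBytes(payload.Length);
        }

        // Typically larger than compressedLength, because the trailing bytes
        // were pulled into the internal buffer as well.
        return (compressedLength, stream.Position);
    }

    public static void Main()
    {
        var (compressedLength, position) = Run();
        Console.WriteLine($"Compressed length: {compressedLength}, position after read: {position}");
    }
}
```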
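So the overread does not indicate any error in the deflate process, but you do have to fix the position manually. To do that, you need to know the size of the compressed data.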
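Once the compressed size is stored in the header, an alternative to repositioning afterwards is to hand DeflateStream only the bytes that belong to the block. The following is a sketch under that assumption, not the only way to do it; BlockReader and ReadBlock are hypothetical names. It pulls exactly compressedLength bytes from the file and decompresses them from a MemoryStream, so the file position ends up exactly at the start of the next item.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class BlockReader
{
    // Reads one compressed block of known size and returns the decompressed bytes.
    // The reader's underlying stream advances by exactly compressedLength bytes.
    public static byte[] ReadBlock(BinaryReader reader, long compressedLength)
    {
        // BinaryReader.ReadBytes returns fewer bytes only if the stream ends early.
        var compressed = reader.ReadBytes(checked((int)compressedLength));
        if (compressed.Length != compressedLength)
            throw new EndOfStreamException("Compressed block is truncated.");

        using var input = new MemoryStream(compressed, writable: false);
        using var deflate = new DeflateStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();

        // DeflateStream can only overread within 'input', which holds nothing else.
        deflate.CopyTo(output);
        return output.ToArray();
    }
}
```

The cost is one extra copy of the compressed bytes in memory. For blocks of a few kilobytes, like the ones in the question, that is probably acceptable; for very large blocks, repositioning the stream as in the workaround avoids the copy.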
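Side note: in this case it's better not to use BinaryReader to read the data. Use DeflateStream.Read instead (and don't forget to check the return value of this method, which indicates how many bytes were actually read).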
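As a sketch of what checking the return value looks like: Read may return fewer bytes than requested, so it has to be called in a loop until the buffer is full or the stream ends. The helper name ReadFully is hypothetical. On .NET 7 and later, Stream.ReadExactly covers the same need, throwing instead of returning a short count.

```csharp
using System;
using System.IO;

public static class StreamReadExtensions
{
    // Keeps calling Read until 'count' bytes have been read or the stream ends.
    // Returns the number of bytes actually read, which the caller must check.
    public static int ReadFully(this Stream stream, byte[] buffer, int offset, int count)
    {
        var total = 0;

        while (total < count)
        {
            var read = stream.Read(buffer, offset + total, count - total);

            if (read == 0)
                break; // End of stream reached before 'count' bytes arrived.

            total += read;
        }

        return total;
    }
}
```

With the question's variables, usage would look like `var data = new byte[item.DataLength]; if (compressStream.ReadFully(data, 0, data.Length) != data.Length) { /* handle truncated data */ }`. This fixes short reads only; the position of the underlying stream still has to be corrected as described above.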