What's the best way to read mixed (i.e. text and binary) data?

I need to read a file format that mixes binary and non-binary data. Assuming I know the input is good, what's the best way to do this? As an example, take a file whose first line is a double written as text, followed by a newline (0x0D 0x0A) and then ten bytes of binary data. I could, of course, calculate the position of the newline, then create a BinaryReader and seek to that position, but I keep thinking there has to be a better way.

You can use System.IO.BinaryReader. The catch is that you must know what type of data comes next before you call any of its Read methods.

Read(byte[], int, int)
Read(char[], int, int)
Read()
Read7BitEncodedInt()
ReadBoolean()
ReadByte()
ReadBytes(int)
ReadChar()
ReadChars(int)
ReadDecimal()
ReadDouble()
ReadInt16()
ReadInt32()
ReadInt64()
ReadSByte()
ReadSingle()
ReadString()
ReadUInt16()
ReadUInt32()
ReadUInt64()

The matching Write overloads exist in System.IO.BinaryWriter.
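As a minimal sketch of what "knowing the type in advance" means, the following round trip writes three values with BinaryWriter and reads them back with BinaryReader in exactly the same order. The class name `TypedRoundTrip` and the sample values are made up for illustration, and the code assumes C# 8 or later. Note that this stores the double as 8 binary bytes, not as a text line, so it does not match the format in the question as-is.

```csharp
using System;
using System.IO;
using System.Text;

static class TypedRoundTrip
{
    // Writes a double, an Int32 and a string, then reads them back.
    // The reads must mirror the writes: same types, same order.
    public static (double, int, string) RoundTrip(double d, int i, string s)
    {
        using var ms = new MemoryStream();
        using (var writer = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(d); // 8 bytes
            writer.Write(i); // 4 bytes
            writer.Write(s); // length prefix followed by the encoded bytes
        }

        ms.Position = 0;
        using var reader = new BinaryReader(ms, Encoding.UTF8, leaveOpen: true);
        double d2 = reader.ReadDouble();
        int i2 = reader.ReadInt32();
        string s2 = reader.ReadString();
        return (d2, i2, s2);
    }

    static void Main()
    {
        var (d, i, s) = RoundTrip(3.14, 42, "hello");
        Console.WriteLine($"{d} {i} {s}");
    }
}
```

Calling the Read methods in a different order than the writes would not fail immediately; it would silently produce wrong values, which is why the layout has to be known up front.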

Is this file format already fixed? If it's not, it's a really good idea to change to use a length-prefixed format for the strings. Then you can read just the right amount and convert it to a string.
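If the format can be changed, a length-prefixed layout could look roughly like this. This is only a sketch under assumptions that are not in the original question: the text is ASCII, the prefix is a 4-byte Int32 byte count, and the names `LengthPrefixed`, `Write` and `Read` are hypothetical.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

static class LengthPrefixed
{
    // Layout (assumed): [Int32 byte count][ASCII text of the double][binary payload]
    public static void Write(Stream output, double value, byte[] payload)
    {
        using var writer = new BinaryWriter(output, Encoding.ASCII, leaveOpen: true);
        byte[] text = Encoding.ASCII.GetBytes(
            value.ToString("R", CultureInfo.InvariantCulture));
        writer.Write(text.Length); // the length prefix
        writer.Write(text);        // exactly text.Length bytes of text
        writer.Write(payload);     // raw binary data follows immediately
    }

    public static (double Value, byte[] Payload) Read(Stream input, int payloadLength)
    {
        using var reader = new BinaryReader(input, Encoding.ASCII, leaveOpen: true);
        int textLength = reader.ReadInt32();
        // Read just the right amount, then decode only those bytes.
        // ReadBytes can return fewer bytes if the stream ends early.
        byte[] text = reader.ReadBytes(textLength);
        double value = double.Parse(
            Encoding.ASCII.GetString(text), CultureInfo.InvariantCulture);
        byte[] payload = reader.ReadBytes(payloadLength);
        return (value, payload);
    }

    static void Main()
    {
        using var ms = new MemoryStream();
        Write(ms, 3.25, new byte[10]);
        ms.Position = 0;
        var (value, payload) = Read(ms, 10);
        Console.WriteLine($"{value.ToString(CultureInfo.InvariantCulture)}, {payload.Length} payload bytes");
    }
}
```

BinaryWriter.Write(string) and BinaryReader.ReadString() already use a length prefix of their own, so if both ends are .NET code those two calls may be all that is needed.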

Otherwise, you'll need to read chunks from the file and scan each one for the newline. Once you find it, decode only the bytes before it. Until you find it, either buffer the chunks somewhere else (e.g. a MemoryStream) or remember the starting point and rewind the stream appropriately. It will be ugly, but that's down to the deficiency of the file format.
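The chunk-and-scan approach might be sketched as follows, assuming a seekable stream, ASCII text and a CRLF terminator as in the question. `NewlineScan`, `ReadAsciiLine` and the default chunk size are hypothetical names and values chosen for the example.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

static class NewlineScan
{
    // Reads one CRLF-terminated ASCII line and leaves the stream positioned
    // on the first byte after the terminator, ready for binary reads.
    public static string ReadAsciiLine(Stream stream, int chunkSize = 256)
    {
        if (!stream.CanSeek)
            throw new NotSupportedException("This sketch needs a seekable stream.");

        long start = stream.Position;           // remembered so we can rewind
        using var collected = new MemoryStream();
        byte[] chunk = new byte[chunkSize];
        int scanned = 0;                         // first index not yet tested as a CR
        int read;
        while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
        {
            collected.Write(chunk, 0, read);
            byte[] buffer = collected.GetBuffer();
            int length = (int)collected.Length;
            for (int i = scanned; i + 1 < length; i++)
            {
                if (buffer[i] == 0x0D && buffer[i + 1] == 0x0A)
                {
                    // Rewind past whatever binary data was read along with the text.
                    stream.Position = start + i + 2;
                    // Decode only the bytes before the newline.
                    return Encoding.ASCII.GetString(buffer, 0, i);
                }
            }
            // The last byte may be a CR whose LF arrives in the next chunk.
            scanned = Math.Max(0, length - 1);
        }
        throw new EndOfStreamException("No CRLF found.");
    }

    static void Main()
    {
        byte[] header = Encoding.ASCII.GetBytes("3.14\r\n");
        byte[] file = new byte[header.Length + 10];
        header.CopyTo(file, 0);

        using var stream = new MemoryStream(file);
        double value = double.Parse(ReadAsciiLine(stream), CultureInfo.InvariantCulture);
        using var reader = new BinaryReader(stream);
        byte[] payload = reader.ReadBytes(10);
        Console.WriteLine($"{value.ToString(CultureInfo.InvariantCulture)}, {payload.Length} payload bytes");
    }
}
```

The rewind is what makes it safe to hand the same stream to a BinaryReader afterwards: nothing after the newline has been consumed or decoded.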

I would suggest you don't "over-decode" (i.e. decode the arbitrary binary data after the string). While it may well do no harm, in some encodings the binary data could form an impossible byte sequence, which then gets into the realm of DecoderFallbacks and the like.
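To illustrate the over-decoding hazard, here is a small sketch using UTF-8 as the example encoding (the question does not say which encoding the text uses, so that is an assumption). The class and method names are hypothetical. A strict decoder throws on bytes that cannot occur in UTF-8, while the default `Encoding.UTF8` quietly substitutes U+FFFD replacement characters.

```csharp
using System;
using System.Text;

static class OverDecodeDemo
{
    // True if a strict UTF-8 decoder rejects the data.
    public static bool StrictDecodeFails(byte[] data)
    {
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
        try
        {
            strict.GetString(data);
            return false;
        }
        catch (DecoderFallbackException)
        {
            return true;
        }
    }

    // True if the default (lenient) decoder had to insert U+FFFD.
    public static bool LenientDecodeSubstitutes(byte[] data)
    {
        return Encoding.UTF8.GetString(data).IndexOf('\uFFFD') >= 0;
    }

    static void Main()
    {
        // "1.5\r\n" followed by bytes that are not valid UTF-8.
        byte[] mixed = { 0x31, 0x2E, 0x35, 0x0D, 0x0A, 0xFF, 0xFE, 0x80 };
        Console.WriteLine(StrictDecodeFails(mixed));
        Console.WriteLine(LenientDecodeSubstitutes(mixed));
    }
}
```

Either outcome is avoided entirely by decoding only the bytes that are known to be text.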

I've had to deal with that when reading HTTP requests coming in over the wire on Compact Framework. My solution was to roll my own non-buffering ASCII-only StreamReader, so that it was safe to interleave calls to both the StreamReader and the underlying Stream.
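A non-buffering ASCII line reader along those lines could look something like the sketch below. It is not the poster's actual code; the class name `UnbufferedAsciiLineReader` and its behavior on non-ASCII bytes are assumptions made for the example. Because it never reads past the line feed, calls to it can be interleaved with direct reads on the underlying Stream.

```csharp
using System;
using System.IO;
using System.Text;

sealed class UnbufferedAsciiLineReader
{
    private readonly Stream _stream;

    public UnbufferedAsciiLineReader(Stream stream)
    {
        _stream = stream;
    }

    // Reads up to and including the next LF, one byte at a time.
    // Returns the line without its CRLF/LF, or null at end of stream.
    public string ReadLine()
    {
        var line = new StringBuilder();
        bool readAnything = false;
        int b;
        while ((b = _stream.ReadByte()) != -1)
        {
            readAnything = true;
            if (b == 0x0A)
            {
                // Drop the CR of a CRLF pair.
                if (line.Length > 0 && line[line.Length - 1] == '\r')
                    line.Length--;
                return line.ToString();
            }
            if (b > 0x7F)
                throw new InvalidDataException("Non-ASCII byte in text line.");
            line.Append((char)b);
        }
        return readAnything ? line.ToString() : null;
    }

    static void Main()
    {
        byte[] header = Encoding.ASCII.GetBytes("2.5\r\n");
        byte[] file = new byte[header.Length + 10];
        header.CopyTo(file, 0);

        using var stream = new MemoryStream(file);
        var text = new UnbufferedAsciiLineReader(stream);
        string first = text.ReadLine();

        // The stream sits exactly after the newline, so binary reads are safe.
        // A real stream may return fewer bytes per call and need a loop.
        byte[] payload = new byte[10];
        int got = stream.Read(payload, 0, payload.Length);
        Console.WriteLine($"{first}, {got} payload bytes");
    }
}
```

Reading a byte at a time trades throughput for correctness, which is usually acceptable for short headers but may be slow on streams without their own internal buffering.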
