Are the raw bytes written by .NET System.IO.BinaryWriter readable by other platforms?
I am manually writing a large data block into a binary file with System.IO.BinaryWriter. I have chosen this due to the improved performance compared to a wide variety of other means of serialization and deserialization (I am currently deserializing with System.IO.BinaryReader).
I may need to use the serialized format in other programming languages such as Java and/or Rust. Would they be able to understand the raw binary written by System.IO.BinaryWriter and read it in a similar manner to .NET's System.IO.BinaryReader?
(I am assuming that the new platforms (Java/Rust) will have implicit knowledge of the specific order in which the raw binary was written.)
I am aware that Protocol Buffers is meant to be a performant, language-agnostic framework for serializing/deserializing in this scenario, but: (1) I am using F# and it struggles with discriminated unions; (2) it wasn't really that much effort to write my own custom serializer, as my types aren't too complex.
It depends on the types you write with the BinaryWriter.
byte, sbyte and byte[]: no problem.
(U)IntXX: a matter of endianness. The .NET BinaryWriter dumps these types in little-endian format.
float and double: no problem if both systems use the same IEEE 754 standard and the same endianness.
decimal: this is a .NET-specific type, similar to Currency but with a different format. Use carefully.
char and char[]: these use the current Encoding of the BinaryWriter. Use the same encoding on both sides and everything is all right.
string: the length of the string (the byte count after encoding) is written in the so-called 7-bit-encoded int format (1 byte for lengths up to 127, and so on), followed by the string bytes in the current encoding.
To make things compatible, maybe you should dump character arrays with manually dumped length information.

Yes, you can.
bool --> 0 | 1
sbyte --> x
byte[] --> xxxxxx
char[] --> encoding.GetBytes(char[])
byte --> x
char --> encoded with the writer's current encoding, same as char[]
decimal --> decimal.GetBytes(), 16 bytes, should see the System.Decimal class code
double --> 8 bytes, should see the System.Double class code
short --> 2 bytes, <lsb><msb>
int --> 4 bytes, <lsb>xx<msb>
long --> 8 bytes, <lsb>xxxxxx<msb>
float --> 4 bytes, should see the System.Single class code
string --> 7 bit encoded length (variable size) + encoding.GetBytes(), see 7 bit encoding method below
ushort --> same as short
uint --> same as int
ulong --> same as long
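For the decimal row, a reader on another platform has to unpack the 16 bytes by hand. Below is a minimal Rust sketch. It assumes the layout used by the .NET implementation as far as I know it: four little-endian 32-bit words in the order lo, mid, hi, flags, with the scale in bits 16 to 23 of flags and the sign in bit 31. `DecimalParts` and `decode_dotnet_decimal` are made-up names for illustration, and the layout is worth checking against the System.Decimal source before relying on it.

```rust
/// Parts of a .NET decimal (hypothetical helper type).
/// The numeric value is (-1)^negative * mantissa / 10^scale.
#[derive(Debug, PartialEq)]
pub struct DecimalParts {
    pub negative: bool,
    pub mantissa: u128, // 96-bit unsigned integer
    pub scale: u8,      // digits after the decimal point, 0..=28
}

/// Decode the 16 bytes written for a decimal.
/// ASSUMPTION: word order is lo, mid, hi, flags, each little endian.
pub fn decode_dotnet_decimal(bytes: &[u8; 16]) -> Option<DecimalParts> {
    // Read the little-endian u32 that starts at offset `i`.
    let word = |i: usize| {
        u32::from_le_bytes([bytes[i], bytes[i + 1], bytes[i + 2], bytes[i + 3]])
    };
    let (lo, mid, hi, flags) = (word(0), word(4), word(8), word(12));

    let scale = ((flags >> 16) & 0xFF) as u8;
    if scale > 28 {
        return None; // not a valid decimal scale
    }

    Some(DecimalParts {
        negative: (flags & 0x8000_0000) != 0,
        mantissa: ((hi as u128) << 64) | ((mid as u128) << 32) | (lo as u128),
        scale,
    })
}
```

With this layout, the value 1.5 would arrive as mantissa 15 with scale 1. Turning the parts into a usable number type is left to whatever decimal library the target platform offers.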
For numeric types, data is written in little-endian format.
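To illustrate what that means for a reader on another platform, here is a rough Rust sketch that pulls fixed-width values out of a byte slice using only the standard library. `LeReader` and its method names are invented for this example, and it only covers a few of the types listed above; the others follow the same pattern.

```rust
/// Hypothetical cursor over bytes produced by a .NET BinaryWriter.
pub struct LeReader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> LeReader<'a> {
    pub fn new(buf: &'a [u8]) -> Self {
        LeReader { buf, pos: 0 }
    }

    /// Take the next N bytes, or None if the input is too short.
    fn take<const N: usize>(&mut self) -> Option<[u8; N]> {
        let end = self.pos.checked_add(N)?;
        let slice = self.buf.get(self.pos..end)?;
        let mut out = [0u8; N];
        out.copy_from_slice(slice);
        self.pos = end;
        Some(out)
    }

    /// bool is a single byte: 0 is false, anything else is read as true.
    pub fn read_bool(&mut self) -> Option<bool> {
        self.take::<1>().map(|b| b[0] != 0)
    }

    /// Integers: least significant byte first.
    pub fn read_i16(&mut self) -> Option<i16> {
        self.take::<2>().map(i16::from_le_bytes)
    }

    pub fn read_i32(&mut self) -> Option<i32> {
        self.take::<4>().map(i32::from_le_bytes)
    }

    pub fn read_u64(&mut self) -> Option<u64> {
        self.take::<8>().map(u64::from_le_bytes)
    }

    /// Floats: IEEE 754 bit pattern, also least significant byte first.
    pub fn read_f32(&mut self) -> Option<f32> {
        self.take::<4>().map(f32::from_le_bytes)
    }

    pub fn read_f64(&mut self) -> Option<f64> {
        self.take::<8>().map(f64::from_le_bytes)
    }
}
```

The reader has to call these methods in the same order the writer used, since the stream carries no type tags. Using `from_le_bytes` rather than a raw memory cast keeps the result correct even if the reading machine is big endian.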
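// From BinaryWriter: writes an int 7 bits at a time, low-order bits first.
protected void Write7BitEncodedInt(int value)
{
    // Treat the value as unsigned so that negative numbers also terminate.
    uint num = (uint) value;
    while (num >= 0x80)
    {
        // Emit the low 7 bits with the high bit set: more bytes follow.
        this.Write((byte) (num | 0x80));
        num = num >> 7;
    }
    // The final byte has the high bit clear.
    this.Write((byte) num);
}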
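The reading side of this encoding is short as well. Here is a sketch in Rust of decoding the length prefix and then the string that follows it. It assumes the writer used UTF-8, which is the BinaryWriter default when no encoding is passed; `read_7bit_encoded_int` and `read_string` are names chosen for this example, not part of any library.

```rust
/// Decode a 7-bit-encoded int from the start of `buf`.
/// Returns the value and the number of bytes consumed.
pub fn read_7bit_encoded_int(buf: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    // A 32-bit value needs at most 5 bytes of 7 bits each.
    for (i, &byte) in buf.iter().enumerate().take(5) {
        // The fifth byte may only carry the top 4 bits of the value.
        if i == 4 && byte > 0x0F {
            return None;
        }
        value |= u32::from(byte & 0x7F) << (7 * i);
        // High bit clear means this was the last byte.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // ran out of input, or no terminating byte within 5 bytes
}

/// Decode a length-prefixed string from the start of `buf`.
/// ASSUMPTION: the writer's encoding was UTF-8.
/// Returns the string and the total number of bytes consumed.
pub fn read_string(buf: &[u8]) -> Option<(String, usize)> {
    let (len, prefix) = read_7bit_encoded_int(buf)?;
    let end = prefix.checked_add(len as usize)?;
    let bytes = buf.get(prefix..end)?;
    let text = String::from_utf8(bytes.to_vec()).ok()?;
    Some((text, end))
}
```

For example, a length of 300 is written as the two bytes 0xAC 0x02, and the string "hi" as 0x02 followed by the bytes for 'h' and 'i'. If the writer was constructed with a different Encoding, the `String::from_utf8` step has to be replaced with a decoder for that encoding.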