Protobuf-net is incompatible with official google Protobuf for C++ (message encoding)

We have a lot of classes in .NET. We used protobuf-net to mark them up and to generate a .proto file, from which we generated the C++ wrappers with the original Google library.

So I have a message. Here is the C++ DebugString() output for an EventBase instance (in .NET, EventCharacterMoved inherits from EventBase, while in C++ I just write to the optional field):

UserId: -2792
EventCharacterMoved {
  Coordinates {
    Position {
      X: 196.41913
      Y: 130
      Z: 213
    }
    Rotation {
      X: 207
      Y: 130
      Z: 213
    }
  }
  OldCoordinates {
    Position {
      X: 196.41913
      Y: 130
      Z: 213
    }
    Rotation {
      X: 207
      Y: 130
      Z: 213
    }
  }
}

(Generated from this .proto file:)

message Coordinates {
   optional TreeFloat Position = 1;
   optional TreeFloat Rotation = 2;
}
message EventBase {
   optional int32 UserId = 10 [default = 0];
   // the following represent sub-types; at most 1 should have a value
   optional EventCharacterMoved EventCharacterMoved = 15;
}
message EventCharacterMoved {
   optional Coordinates Coordinates = 100;
   optional Coordinates OldCoordinates = 101;
}
message TreeFloat {
   optional float X = 1 [default = 0];
   optional float Y = 2 [default = 0];
   optional float Z = 3 [default = 0];
}
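To relate this .proto to raw bytes, it can help to work out the key that precedes each field on the wire. The sketch below is plain C++ with no protobuf dependency; the names MakeKey, EncodeVarint and SelfCheck are hypothetical helpers introduced here for illustration and are not part of either library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Wire types from the protobuf encoding that this schema uses.
enum WireType : std::uint32_t {
    kVarint = 0,           // int32 (UserId)
    kLengthDelimited = 2,  // embedded messages
    kFixed32 = 5           // float (X, Y, Z)
};

// A field key is (field_number << 3) | wire_type.
constexpr std::uint32_t MakeKey(std::uint32_t field_number, WireType type) {
    return (field_number << 3) | static_cast<std::uint32_t>(type);
}

// Keys are written as varints: 7 bits per byte, low bits first,
// with the high bit set on every byte except the last.
std::vector<std::uint8_t> EncodeVarint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

static_assert(MakeKey(10, kVarint) == 0x50, "EventBase.UserId");
static_assert(MakeKey(15, kLengthDelimited) == 0x7A, "EventBase.EventCharacterMoved");
static_assert(MakeKey(1, kFixed32) == 0x0D, "TreeFloat.X");

// Field numbers above 15 need a two-byte key, e.g. Coordinates = 100.
void SelfCheck() {
    const std::vector<std::uint8_t> expected = {0xA2, 0x06};
    assert(EncodeVarint(MakeKey(100, kLengthDelimited)) == expected);
}
```

With these keys in hand, a message that starts with byte 0x50 begins with UserId, and one that starts with 0x7A begins with the EventCharacterMoved sub-message.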

In C++ I send this message, and we send a message with the same contents from .NET.

The C++ code can parse both the C++-encoded message and the .NET-encoded one. The .NET code can only parse the .NET message.

On the wire both messages are 87 bytes long (the .NET file and the C++ file are the same size), yet the contents differ:

[image: byte-level comparison of the .NET and C++ files]

As you can see, they are similar but not the same. Because of this difference, the C++ code can read the .NET (C#) messages, while .NET cannot read the C++ messages.

In code on deserialization we get:

An unhandled exception of type 'System.InvalidCastException' occurred in TestProto.exe

Additional information: Unable to cast object of type 'TestProto.EventBase' to type 'TestProto.EventCharacterMoved'.

in code like:

using (var inputStream = File.Open(@"./cpp_in.bin", FileMode.Open, FileAccess.Read)) {
    var ecm = Serializer.Deserialize<EventCharacterMoved>(inputStream);
}

Let's look at the output of protoc --decode_raw (as suggested by jpa in his comment):

[image: protoc --decode_raw output]

This may be related to the fact that my C++ wrapper uses the latest Google protobuf version, while protobuf-net perhaps uses an older encoding format, or something like that.

So I wonder how to make protobuf-net read the C++ messages (so that both sides can decode the same data)?

Or, at least, how to make the original Google protobuf encode the same way protobuf-net does?

For those who are really interested and would like to dig into it, there is a zipped bundle with a simplified example (VS 2010 solutions for the C++ and C# code included).

Edit: this should be fixed in r616 and above.


I've finally had a chance to look at this (apologies for the delay, but seasonal holiday social demands intervened). I understand what is happening now.

Basically, the data is theoretically identical; what this comes down to is field ordering. Fields are usually written in ascending field-number order, but a reader should accept them in any order. For types that don't involve inheritance, protobuf-net works fine regardless of order. The protobuf specification does not define inheritance, so protobuf-net adds support for it (due to constant demand) as an extension to the specification. As an implementation detail, it writes the sub-class information first (i.e. field 15, the sub-type, is written ahead of field 10). At the current time, it also expects the sub-type information first during deserialization. This has rarely affected anyone: since protobuf-net is the only implementation that uses inheritance like this, the inheritance feature is mostly seen in protobuf-net to protobuf-net usage.
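The ordering difference can be reproduced without either library. The following is a minimal sketch in plain C++, assuming a hand-rolled encoder: it writes UserId (field 10) and the sub-type (field 15) in both orders and lists the order in which the field numbers appear. All function names are hypothetical, and the sub-message payload is a three-byte stand-in (an EventCharacterMoved holding an empty Coordinates), not the full message from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends an unsigned varint (7 bits per byte, high bit marks continuation).
void PutVarint(Bytes& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));
}

// Field 10 (UserId, int32): negative values are sign-extended to 64 bits.
void PutUserId(Bytes& out, std::int32_t id) {
    PutVarint(out, (10u << 3) | 0u);
    PutVarint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(id)));
}

// Field 15 (the sub-type): length-delimited, payload already encoded.
void PutSubType(Bytes& out, const Bytes& payload) {
    PutVarint(out, (15u << 3) | 2u);
    PutVarint(out, payload.size());
    out.insert(out.end(), payload.begin(), payload.end());
}

// Ascending field order, as a typical writer emits it.
Bytes EncodeAscending(std::int32_t id, const Bytes& payload) {
    Bytes out;
    PutUserId(out, id);
    PutSubType(out, payload);
    return out;
}

// Sub-type first, the order the answer describes for protobuf-net.
Bytes EncodeSubTypeFirst(std::int32_t id, const Bytes& payload) {
    Bytes out;
    PutSubType(out, payload);
    PutUserId(out, id);
    return out;
}

bool GetVarint(const Bytes& in, std::size_t& pos, std::uint64_t& v) {
    v = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t b = in[pos++];
        v |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return true;
    }
    return false;
}

// Returns top-level field numbers in wire order (varint and
// length-delimited fields only; parsing stops at anything else).
std::vector<std::uint32_t> FieldOrder(const Bytes& in) {
    std::vector<std::uint32_t> order;
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::uint64_t key = 0, value = 0;
        if (!GetVarint(in, pos, key)) break;
        const std::uint32_t wire = static_cast<std::uint32_t>(key & 7);
        if (wire == 0) {
            if (!GetVarint(in, pos, value)) break;
        } else if (wire == 2) {
            if (!GetVarint(in, pos, value) || value > in.size() - pos) break;
            pos += static_cast<std::size_t>(value);
        } else {
            break;
        }
        order.push_back(static_cast<std::uint32_t>(key >> 3));
    }
    return order;
}

void SelfCheck() {
    const Bytes payload = {0xA2, 0x06, 0x00};  // field 100, empty Coordinates
    const Bytes a = EncodeAscending(-2792, payload);
    const Bytes b = EncodeSubTypeFirst(-2792, payload);
    assert(a.size() == b.size());  // same length
    assert(a != b);                // different bytes
    assert((FieldOrder(a) == std::vector<std::uint32_t>{10, 15}));
    assert((FieldOrder(b) == std::vector<std::uint32_t>{15, 10}));
}
```

Both encodings have the same length and carry the same fields, which matches the observation in the question that the two files are the same size but differ byte for byte.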

In your case, you're using .proto to interop with CPP, which means the CPP code will be able to consume the protobuf-net data, but protobuf-net may throw a type-cast exception going the other way (basically, it starts constructing the concrete type when it gets the first data field).

Despite rarely being an issue, this is something that needs fixing. I can try to look at this later today or tomorrow.

Options:

  • make sure the sub-type fields are always lower than any data fields
  • if you know which sub-type to expect, use the Merge API and pass in a new instance of the desired type; this will then populate the existing object correctly
  • wait a day or two (hopefully!) and use build r616 or above for a proper fix
  • avoid inheritance (and other implementation-specific features) when using interop
    • note you can model the same data without inheritance, via encapsulation - and it will work happily; it is specifically the creation of the concrete type that is the issue here
  • go to unreasonable lengths (meaning: I don't consider this an actual solution) when constructing the data on the CPP side, by writing it in two pieces:
    • write an EventBase with just the EventCharacterMoved data first, and serialize it; then, in a separate model, write an EventBase with just the TreeFloat data, and serialize that; this will simulate writing them in the required order (protobuf streams are appendable) - not pretty
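The last option relies on protobuf streams being appendable: the concatenation of two encoded messages of the same type decodes as their merge. A minimal sketch of that idea in plain C++ follows. It assumes the base-level data to be written second is the UserId field (field 10); the function names are hypothetical and the sub-type payload is a three-byte stand-in rather than a real serialized EventCharacterMoved. With the real Google library, each fragment would instead come from serializing a separately populated EventBase.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends an unsigned varint (7 bits per byte, high bit marks continuation).
void PutVarint(Bytes& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));
}

// First fragment: an EventBase holding only field 15 (the sub-type).
Bytes EncodeSubTypeOnly(const Bytes& payload) {
    Bytes out;
    PutVarint(out, (15u << 3) | 2u);
    PutVarint(out, payload.size());
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Second fragment: an EventBase holding only field 10 (UserId).
Bytes EncodeUserIdOnly(std::int32_t id) {
    Bytes out;
    PutVarint(out, (10u << 3) | 0u);
    PutVarint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(id)));
    return out;
}

// Appending the second fragment to the first gives one message whose
// bytes start with the sub-type field.
Bytes EncodeInTwoPieces(std::int32_t id, const Bytes& payload) {
    Bytes out = EncodeSubTypeOnly(payload);
    const Bytes tail = EncodeUserIdOnly(id);
    out.insert(out.end(), tail.begin(), tail.end());
    return out;
}

void SelfCheck() {
    const Bytes payload = {0xA2, 0x06, 0x00};  // field 100, empty Coordinates
    const Bytes wire = EncodeInTwoPieces(-2792, payload);
    assert(wire[0] == 0x7A);  // key of field 15 comes first
    assert(wire[5] == 0x50);  // key of field 10 follows the sub-message
}
```

The combined bytes are what a reader that expects the sub-type first would want to see, at the cost of two serialization passes on the sending side.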

This looks pretty similar to the problems noted in http://code.google.com/p/protobuf-net/issues/detail?id=299 and http://code.google.com/p/protobuf-net/issues/detail?id=331 which were allegedly fixed by http://code.google.com/p/protobuf-net/source/detail?r=595

Is the version of .NET protobuf you're using new enough to have incorporated that fix?
