Accessing field of Protobuf message of unknown type in Python

Let's say I have two Protobuf messages, A and B. Their overall structure is similar, but not identical, so we moved the shared parts out into a separate message we called Common. This works beautifully.

However, I'm now facing the following problem: there is a special case where I have to process a serialized message without knowing whether it is of type A or type B. I have a working solution in C++ (shown below), but I have not found a way to do the same thing in Python.

Example:

// file: Common.proto
// contains some kind of shared struct that is used by all messages:
message Common {
 ...
}

// file: A.proto
import "Common.proto";

message A {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;

   ... A-specific Fields ...
}

// file: B.proto
import "Common.proto";

message B {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;

   ... B-specific Fields ...
}

Working Solution in C++

In C++ I'm using the reflection API to get access to the CommonSettings field like this:

namespace gp = google::protobuf;
...
const Common* getCommonBlock(const gp::Message* paMessage)
{
   // Look up the shared field by its field number (3 in both A and B).
   const gp::FieldDescriptor* paFieldDescriptor = paMessage->GetDescriptor()->FindFieldByNumber(3);
   const gp::Reflection* paReflection = paMessage->GetReflection();
   // GetMessage returns a const reference owned by *paMessage.
   return dynamic_cast<const Common*>(&paReflection->GetMessage(*paMessage, paFieldDescriptor));
}

The method getCommonBlock uses FindFieldByNumber() to get hold of the descriptor of the field I'm trying to get. Then it uses reflection to fetch the actual data. getCommonBlock can process messages of type A, B or any future type as long as the Common field keeps field number 3.

My question is: is there a way to do a similar thing in Python? I've been looking at the Protobuf documentation, but couldn't figure out a way to do it.

I know this is an old thread, but I'll respond anyway for posterity:

Firstly, as you know, it's not possible to determine the type of a protocol buffer message purely from its serialized form. The only information you have access to in the serialized form is the field numbers and their serialized values.
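
To make that point concrete, here is a minimal sketch of a hand-written scanner for the protobuf wire format, using only the standard library. It lists the field numbers, wire types and raw values of a serialized message without knowing its type. The helper names (read_varint, scan_fields) and the sample bytes are made up for illustration, and the deprecated group wire types are not handled.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def scan_fields(buf):
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint (int32, bool, enum, ...)
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (string, bytes, sub-message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


# Hand-encoded sample (an assumption, not real data): field 1 = 1, field 2 = 1,
# field 3 = a 2-byte sub-message, plus a type-specific field 4 = 150.
sample = bytes([0x08, 0x01, 0x10, 0x01, 0x1A, 0x02, 0x08, 0x07, 0x20, 0x96, 0x01])
fields = list(scan_fields(sample))
for field_number, wire_type, value in fields:
    print(field_number, wire_type, value)
```

Nothing in the output says whether the bytes came from an A or a B; field 3 is just a length-delimited blob that happens to sit at the number both types share.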

Secondly, the "right" way to do this would be to have a proto that contains both, like

message Parent {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;

   oneof letters_of_alphabet {
      A a_specific = 4;
      B b_specific = 5;
   }
}

This way, there's no ambiguity: you just parse the same proto (Parent) every time.


Anyway, if it's too late to change that, what I recommend is defining a new message with only the shared fields, like

message Shared {
   required int32  FormatVersion             = 1;
   optional bool   SomeFlag [default = true] = 2;
   optional Common CommonSettings            = 3;
}

You should then be able to pretend that the message (either A or B) is in fact a Shared, and parse it accordingly. The fields that Shared does not declare are treated as unknown fields and will be irrelevant.

One of the advantages of Python over a statically-typed language like C++ is that you don't need to use any special reflection code to get an attribute of an object of unknown type: you just ask the object. The built-in function that does this is getattr, so you can do:

settings_value = getattr(obj, 'CommonSettings')
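
If the goal is to mirror the C++ lookup by field number rather than by name, a sketch along these lines may work. It assumes the message object exposes a DESCRIPTOR with a fields_by_number mapping, as Python protobuf message classes do. The function name get_field_by_number is hypothetical, and the objects below are plain stand-ins so the example runs without any generated code.

```python
from types import SimpleNamespace


def get_field_by_number(msg, number):
    """Look up a field's name by its number, then read it with getattr."""
    field = msg.DESCRIPTOR.fields_by_number[number]
    return getattr(msg, field.name)


def make_stand_in(**values):
    """Build a fake message object (NOT a real protobuf message)."""
    descriptor = SimpleNamespace(
        fields_by_number={3: SimpleNamespace(name="CommonSettings")}
    )
    return SimpleNamespace(DESCRIPTOR=descriptor, **values)


msg_a = make_stand_in(CommonSettings="common block of A", OnlyInA=1)
msg_b = make_stand_in(CommonSettings="common block of B", OnlyInB=2)

print(get_field_by_number(msg_a, 3))
print(get_field_by_number(msg_b, 3))
```

Note that this only helps once the bytes have already been parsed into an object of some type; it does not by itself tell you which class to parse with.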

I had a similar problem.

What I did was to create a new message, with an enum specifying the type:

enum TYPE {
  A = 0;
  B = 1;
}
message Base {
  required TYPE type = 1;
  ... Other common fields ...
}

Then create specific message types:

message A {
  required TYPE type = 1 [default = A];
  ... other A fields ...
}

And:

message B {
  required TYPE type = 1 [default = B];
  ... other B fields ...
}

Be sure to define the 'Base' message correctly, or you won't stay binary compatible if you add fields later (as you would have to shift the field numbers of the inheriting messages too).

That way, you can receive a generic message:

msg = ...  # receive message from net

# detect message type
packet = Base()
packet.ParseFromString(msg)

# check for type
if packet.type == TYPE.A:
    # parse message as appropriate type
    packet = A()
    packet.ParseFromString(msg)
else:
    # this is a B message... or whatever
    packet = B()
    packet.ParseFromString(msg)

# ... continue with your business logic ...
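
If more message types are added later, the if/else chain can be swapped for a lookup table. This is only a sketch: the classes A and B below are stand-ins with a fake ParseFromString (not generated protobuf classes), and PARSERS and parse_by_type are hypothetical names. The keys 0 and 1 mirror the TYPE enum values above.

```python
class A:
    """Stand-in for the generated A class."""
    def ParseFromString(self, data):
        self.raw = data


class B:
    """Stand-in for the generated B class."""
    def ParseFromString(self, data):
        self.raw = data


# Maps the value of the 'type' field to the class that should parse the bytes.
PARSERS = {0: A, 1: B}


def parse_by_type(type_value, data):
    """Pick the message class for type_value and parse data with it."""
    try:
        cls = PARSERS[type_value]
    except KeyError:
        raise ValueError(f"unknown message type {type_value}") from None
    packet = cls()
    packet.ParseFromString(data)
    return packet


packet = parse_by_type(1, b"\x08\x01")
print(type(packet).__name__)
```

With real generated classes, the first argument would come from the type field of the packet parsed as Base.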

Hope this helps.

How about "concatenating" two protocol buffers in a header+payload format, e.g. a header carrying the common data, followed by either message A or B, as suggested by protobuf techniques?

This is how I did it with various types of payload as a blob within an MQTT message.
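
One possible framing for such a blob is sketched below. The 4-byte big-endian length prefix is an assumption made for illustration, not something specified in this answer, and pack_frame and unpack_frame are hypothetical helpers. The header and payload are treated as already-serialized bytes.

```python
import struct


def pack_frame(header: bytes, payload: bytes) -> bytes:
    """Prefix the header with its length so the receiver can split the blob."""
    return struct.pack(">I", len(header)) + header + payload


def unpack_frame(frame: bytes):
    """Split a frame back into (header_bytes, payload_bytes)."""
    (header_len,) = struct.unpack_from(">I", frame, 0)
    header_end = 4 + header_len
    if header_end > len(frame):
        raise ValueError("truncated frame")
    return frame[4:header_end], frame[header_end:]


# Placeholder bytes standing in for a serialized header and an A or B payload.
frame = pack_frame(b"\x08\x00", b"\x08\x2a")
header, payload = unpack_frame(frame)
print(header, payload)
```

The receiver would parse the header first, read the type information from it, and only then decide which message class to use for the payload.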
