
Efficient message field setting in Python Protobuf

I am using Protobuf (v3.5.1) in a Python project I'm working on. My situation can be simplified to the following:

// Proto file

syntax = "proto3";

message Foo {
    Bar bar = 1;
}

message Bar {
    bytes lotta_bytes_here = 1;
}

# Python excerpt
def MakeFooUsingBar(bar):
    foo = Foo()
    # CopyFrom() copies the contents of bar into the nested foo.bar message.
    foo.bar.CopyFrom(bar)
    return foo

I am worried about the memory performance of .CopyFrom() (if I am correct, it copies the contents rather than the reference). In C++, I could use something like:

Foo foo;
Bar* bar = new Bar();
bar->set_lotta_bytes_here("abcd");
foo.set_allocated_bar(bar);

Judging by the generated source, this does not appear to copy anything:

inline void Foo::set_allocated_bar(::Bar* bar) {
  ::google::protobuf::Arena* message_arena = GetArenaNoVirtual();
  if (message_arena == NULL) {
    delete bar_;
  }
  if (bar) {
    ::google::protobuf::Arena* submessage_arena = NULL;
    if (message_arena != submessage_arena) {
      bar = ::google::protobuf::internal::GetOwnedMessage(
          message_arena, bar, submessage_arena);
    }

  } else {

  }
  bar_ = bar;
  // @@protoc_insertion_point(field_set_allocated:Foo.bar)
}

Is there something similar available in Python? I have looked through the Python generated sources, but found nothing applicable.

When it comes to large string or bytes objects, Protobuf seems to handle the situation fairly well. The following test passes, which means that although a new Bar object is created, the bytes value is copied by reference (Python bytes are immutable, so this makes sense):

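def test_copy_from_with_large_bytes_field(self):
    bar = Bar()
    bar.lotta_bytes_here = b'12345'
    foo = Foo()
    foo.bar.CopyFrom(bar)

    # The nested message is a distinct object...
    self.assertIsNot(bar, foo.bar)
    # ...but the bytes value itself is shared, not duplicated.
    self.assertIs(bar.lotta_bytes_here, foo.bar.lotta_bytes_here)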

This solves my issue with large bytes objects. However, if someone's problem lies in nested or repeated fields, this will not help: such fields are copied field by field. That makes sense: whoever copies a message wants the two to be independent. If they were not, changes to the original message would also modify the copy (and vice versa).
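The same trade-off can be illustrated with plain Python objects. This is only an analogy under stated assumptions: Blob is a hypothetical stand-in class, not a protobuf message, and copy.deepcopy is not what CopyFrom() uses internally. It relies on CPython's deepcopy treating bytes as atomic, so the immutable payload is shared while the mutable list is duplicated.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Blob:
    """Hypothetical stand-in for a message; NOT a protobuf class."""
    payload: bytes = b""                        # immutable value
    chunks: list = field(default_factory=list)  # mutable container


original = Blob(payload=b"x" * 1_000_000, chunks=[1, 2, 3])
duplicate = copy.deepcopy(original)

# The immutable bytes object is shared, so the large payload is not duplicated.
payload_shared = duplicate.payload is original.payload

# The mutable list is a separate object, so the two copies stay independent.
chunks_shared = duplicate.chunks is original.chunks
duplicate.chunks.append(4)  # does not affect original.chunks

print(payload_shared, chunks_shared, original.chunks)
```

Sharing is safe only for the immutable part; anything that can be mutated has to be copied to keep the two objects independent, which is the behaviour described above for nested and repeated fields.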

If there is anything akin to the C++ move semantics ( https://github.com/google/protobuf/issues/2791 ) or set_allocated_...() in Python protobuf, that would solve it; however, I am not aware of such a feature.


 