
How to deserialize multiple objects in Ruby 1.8.7 lazily

I need to serialize lots of objects to a file (multiple GBs). We have chosen to use Google's protocol buffers for other things in this project, so I thought I would use them to serialize the objects I receive from the wire. This seems to work:

File.open(file_name, 'ab') do |f|
  some_objects.each { |some_object|
    some_object.serialize(f)
  }
end

Deserialization is what is giving me issues. I have seen others deserialize a single object like this:

File.open(file_name, 'r') do |f|
  no = some_object.parse(f)
end

But that only reads one. I tried doing this:

File.open(file_name, 'r').each do |f|
  no = some_object.parse(f)
end

But that raised this exception:

Uncaught exception: undefined method `<<' for false:FalseClass

I need to get all of them and evaluate them lazily. Any thoughts? Please feel free to give any advice on the performance of this code, since I'll be processing GBs of data. Thanks for your time.

By the way, I know I need to upgrade my Ruby version, but since this is an internal project I haven't been able to get time from the boss to upgrade it.

I am using ruby-protocol-buffers.

Encoded protobufs are not self-delimiting. If you write several to a stream and then try to parse them, the entire stream will be parsed as a single message, with later field values overwriting earlier ones. You will need to prefix each message with its size, then make sure to read only that many bytes on the receiving end.

https://developers.google.com/protocol-buffers/docs/techniques#streaming

Unfortunately I don't know Ruby, so I can't give you code samples. It looks like the LimitedIO class in the Ruby protobuf library you linked might be useful for parsing messages without reading past a certain length.
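
For illustration only, here is a minimal Ruby sketch of the length-prefix framing described above. It works on raw byte strings and does not call the protobuf library at all. The helper names write_delimited and each_delimited are hypothetical, and the 4-byte big-endian prefix is an assumed format choice (the official protobuf implementations use a varint prefix instead). Records are yielded one at a time, so only one is held in memory at once.

```ruby
require 'stringio'

# Hypothetical helpers for illustration; not part of ruby-protocol-buffers.

# Write one record: a 4-byte big-endian length, then the payload bytes.
def write_delimited(io, bytes)
  io.write([bytes.bytesize].pack('N'))
  io.write(bytes)
end

# Yield each payload in turn, reading a single record at a time.
def each_delimited(io)
  loop do
    header = io.read(4)
    break if header.nil?                       # clean end of stream
    raise EOFError, 'truncated length prefix' if header.bytesize < 4
    size = header.unpack('N').first
    bytes = io.read(size)
    raise EOFError, 'truncated record' if bytes.nil? || bytes.bytesize < size
    yield bytes
  end
end

# Round trip through an in-memory buffer standing in for the file.
buffer = StringIO.new
['first', '', 'third record'].each { |payload| write_delimited(buffer, payload) }
buffer.rewind
each_delimited(buffer) { |payload| puts payload.inspect }
```

With a real file, open it in binary mode ('ab' for writing, 'rb' for reading). The payload passed to write_delimited would be each message serialized to a string, and each yielded payload would be handed to the message class's parser; check the library's documentation for the exact method names, since they are not shown here.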
