
Python - the best way to serialize data with high performance?

Original post 2012-04-08 18:48:51 6 3 python/ serialization/ marshalling/ pickle

I need to serialize data with high performance. A separate thread will access it every second and must load the data into memory. There will be about 1,000 to 10,000 dictionary-like entries describing user sessions (id, sessid, login date). Some of the data will be updated frequently because logins only last for a limited time.

This data will be shared between a Python server and a Django application. I'm thinking of using pickle or its faster version, cPickle. I also found marshal.

What is the best way to do that? Is cPickle efficient enough? Or maybe marshal?
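One way to answer the "efficient enough" question is to measure it on data shaped like the real thing. The following is a minimal sketch, assuming Python 3 (where `pickle` already uses the C implementation that Python 2 exposed as cPickle). The record layout and values are made up for illustration, and the numbers it prints will vary by machine, so treat it as a measuring tool rather than a result.

```python
import marshal
import pickle
import timeit

# Hypothetical session records shaped like the entries in the question.
sessions = {
    i: {"id": i, "sessid": f"sess-{i:08d}", "login_date": 1333910931.0 + i}
    for i in range(10_000)
}


def roundtrip_pickle(data):
    """Serialize and deserialize with the newest pickle protocol."""
    return pickle.loads(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))


def roundtrip_marshal(data):
    """Serialize and deserialize with marshal (built-in types only)."""
    return marshal.loads(marshal.dumps(data))


# Average over a few runs; absolute timings depend on the machine.
runs = 20
pickle_seconds = timeit.timeit(lambda: roundtrip_pickle(sessions), number=runs) / runs
marshal_seconds = timeit.timeit(lambda: roundtrip_marshal(sessions), number=runs) / runs

print(f"pickle:  {pickle_seconds * 1000:.2f} ms per round trip")
print(f"marshal: {marshal_seconds * 1000:.2f} ms per round trip")
```

Keep in mind that the marshal format is not guaranteed to be stable across Python versions, and pickle should only be loaded from trusted sources, which matters when two separate applications share the data.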

EDIT: Access time is very important. It's going to be a real-time websocket server, so I need very low latency. Is it faster to read cPickle data once a second, or to connect to a database like Redis?

3 answers

A better approach may be to use an in-memory cache: memcached if your needs are simple, or something with a richer feature set, like redis.

redis still requires serializing any complex Python object, so it doesn't solve this problem unless you represent all your data as simple keys and simple values. redis is not a deserialization solution; it's just a data store for strings. In any case, redis is one of the slower options: https://charlesleifer.com/blog/completely-un-scientific-benchmarks-of-some-embedded-databases-with-python/
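To make the distinction concrete, here is a small sketch of the two ways a session could be stored in a string-only store. It deliberately leaves out any client call, so nothing here is a redis API; the function names and the sample session are hypothetical. The first pair serializes the whole object into one string, and the second pair flattens it into simple string fields, which is the "simple keys and simple values" case.

```python
import json


def to_blob(session):
    """Serialize the whole session into a single string value."""
    return json.dumps(session, separators=(",", ":"))


def from_blob(blob):
    """Rebuild the session from its serialized string."""
    return json.loads(blob)


def to_flat_fields(session):
    """Flatten the session into string fields; no serializer involved."""
    return {key: str(value) for key, value in session.items()}


def from_flat_fields(fields):
    """Convert string fields back to typed values (layout assumed known)."""
    return {
        "id": int(fields["id"]),
        "sessid": fields["sessid"],
        "login_date": float(fields["login_date"]),
    }


# Hypothetical session, shaped like the entries described in the question.
session = {"id": 42, "sessid": "abc123", "login_date": 1333910931.0}

blob = to_blob(session)
fields = to_flat_fields(session)
print(blob)
print(fields)
```

With the flat layout, a single field such as the login date can be updated without rewriting the whole record, at the cost of converting types by hand when reading.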

Use a real database in memory. Don't use pickle, cPickle, marshal, or anything like that.
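The answer does not name a specific database, so the following is only one possible reading: a minimal sketch using the standard library's `sqlite3` module with an in-memory database. The table layout and the helper names `upsert_session` and `get_session` are assumptions made for illustration. Note that a `:memory:` SQLite database lives inside a single connection in a single process, so by itself it would not be shared between a separate Python server and a Django application.

```python
import sqlite3

# In-memory database; it exists only for the lifetime of this connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sessions (
        id INTEGER PRIMARY KEY,
        sessid TEXT NOT NULL UNIQUE,
        login_date REAL NOT NULL
    )
    """
)


def upsert_session(conn, user_id, sessid, login_date):
    """Insert a session, or overwrite the existing row for this user."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO sessions (id, sessid, login_date) "
            "VALUES (?, ?, ?)",
            (user_id, sessid, login_date),
        )


def get_session(conn, sessid):
    """Return (id, sessid, login_date) for a session id, or None."""
    return conn.execute(
        "SELECT id, sessid, login_date FROM sessions WHERE sessid = ?",
        (sessid,),
    ).fetchone()


upsert_session(conn, 1, "abc123", 1333910931.0)
upsert_session(conn, 1, "abc123", 1333910999.0)  # frequent update case
print(get_session(conn, "abc123"))
```

Updating one row this way avoids re-serializing all sessions on every change. By default a `sqlite3` connection may only be used from the thread that created it, which is worth keeping in mind given the separate reader thread described in the question.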