
Python - the best way to serialize data with high performance?

I need to serialize data, and performance is critical. A separate thread will access it every second and must load the data into memory. There will be about 1,000 to 10,000 dictionary-like entries describing user sessions (id, sessid, login date). Some of the data will be updated frequently because a login is only valid for a limited time.

This data will be shared between a Python server and a Django application. I'm thinking of using pickle or its faster version, cPickle. I also found marshal.

What is the best way to do that? Is cPickle efficient enough? Or maybe marshal?

EDIT: Access time is very important. This is going to be a realtime websocket server, so I need very low latency. Is it faster to read cPickle data once a second or to connect to a database like Redis?
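One way to get a feel for the pickle vs. marshal part of the question is to time both on data of roughly the described size. The following is a minimal sketch, not a rigorous benchmark: the entry layout, the `make_sessions` and `time_roundtrip` helpers, and the use of float timestamps for the login date are all assumptions made for illustration. On Python 3 there is no separate cPickle module; `pickle` uses the C implementation automatically when it is available.

```python
import marshal
import pickle
import time


def make_sessions(n):
    """Build n dictionary-like session entries (id, sessid, login date).

    The field names and values are illustrative assumptions.
    """
    return {
        i: {"id": i, "sessid": f"sess-{i:08d}", "login": 1700000000.0 + i}
        for i in range(n)
    }


def time_roundtrip(dumps, loads, data, repeats=20):
    """Return (avg dump seconds, avg load seconds, restored object)."""
    start = time.perf_counter()
    for _ in range(repeats):
        blob = dumps(data)
    dump_time = (time.perf_counter() - start) / repeats

    start = time.perf_counter()
    for _ in range(repeats):
        restored = loads(blob)
    load_time = (time.perf_counter() - start) / repeats
    return dump_time, load_time, restored


def pickle_dumps(data):
    # Use the newest pickle protocol supported by this interpreter.
    return pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)


sessions = make_sessions(10_000)
results = {
    "pickle": time_roundtrip(pickle_dumps, pickle.loads, sessions),
    "marshal": time_roundtrip(marshal.dumps, marshal.loads, sessions),
}

for name, (dump_time, load_time, _) in results.items():
    print(f"{name:8s} dump {dump_time * 1e3:.2f} ms  load {load_time * 1e3:.2f} ms")
```

The numbers depend heavily on the machine and Python version, so they are worth measuring rather than assuming. Note also that marshal only handles a limited set of built-in types and its format is not guaranteed to be stable across Python versions, which matters when two separate programs share the data.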

A better approach may be to use an in-memory cache: memcached if your needs are simple, or something with a richer feature set, like redis.

redis still requires serializing any complex Python object, so it doesn't solve this problem unless you represent all your data as simple keys and simple values. redis is not a deserialization solution; it's just a data store for strings. In any case, redis is one of the slower options: https://charlesleifer.com/blog/completely-un-scientific-benchmarks-of-some-embedded-databases-with-python/
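To illustrate what "simple keys and simple values" could look like for the session entries from the question, here is a rough sketch that flattens one session into a string-to-string mapping and rebuilds it. Everything here is an assumption for illustration: the `session_to_fields` and `fields_to_session` helpers, the `session:<id>` key scheme, and the plain dict `store`, which only stands in for a key-value store and is not a redis client.

```python
from datetime import datetime, timezone


def session_to_fields(session):
    """Flatten one session into a str -> str mapping (simple values only)."""
    return {
        "id": str(session["id"]),
        "sessid": session["sessid"],
        "login": session["login"].isoformat(),
    }


def fields_to_session(fields):
    """Rebuild the session dict from its flat string fields."""
    return {
        "id": int(fields["id"]),
        "sessid": fields["sessid"],
        "login": datetime.fromisoformat(fields["login"]),
    }


# Plain dict used as a stand-in for a key-value store (hypothetical).
store = {}

session = {
    "id": 42,
    "sessid": "abc123",
    "login": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
}

key = f"session:{session['id']}"  # hypothetical key naming scheme
store[key] = session_to_fields(session)
restored = fields_to_session(store[key])
```

The conversion in both directions is still a form of serialization, just a hand-written one, which is the point the answer above makes.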

Use a real database in memory. Don't use pickle, cPickle, marshal, or anything like that.
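The answer does not name a specific database. As one possible reading, here is a minimal sketch using the standard library's `sqlite3` module with an in-memory database; the table layout and the `touch` and `lookup` helpers are assumptions for illustration. Keep in mind that a `:memory:` SQLite database is private to the connection that created it, so sharing the data between a separate Python server and a Django application would need a file-backed database or a database server instead.

```python
import sqlite3
import time

# In-memory database; it exists only for the lifetime of this connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    " id INTEGER PRIMARY KEY,"
    " sessid TEXT NOT NULL UNIQUE,"  # UNIQUE also creates an index for lookups
    " login_ts REAL NOT NULL)"
)

now = time.time()
with conn:  # commits on success, rolls back on an exception
    conn.executemany(
        "INSERT INTO sessions (id, sessid, login_ts) VALUES (?, ?, ?)",
        ((i, f"sess-{i:08d}", now) for i in range(10_000)),
    )


def touch(conn, sessid, ts):
    """Update the login time of one session; return the number of rows changed."""
    with conn:
        cur = conn.execute(
            "UPDATE sessions SET login_ts = ? WHERE sessid = ?", (ts, sessid)
        )
    return cur.rowcount


def lookup(conn, sessid):
    """Return (id, sessid, login_ts) for a session, or None if it is unknown."""
    return conn.execute(
        "SELECT id, sessid, login_ts FROM sessions WHERE sessid = ?", (sessid,)
    ).fetchone()


updated = touch(conn, "sess-00000042", now + 60)
row = lookup(conn, "sess-00000042")
```

With this layout a frequent update touches a single row instead of rewriting and reloading the whole serialized blob once a second.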

