Caching huge data in process memory
I am working in the finance industry. We want to rule out database hits for data processing, since they are very costly. So we are planning to go for on-demand cache logic [runtime insert & runtime lookup].
Has anyone worked on implementing caching logic for more than 10 million records? Each record is about 160-200 bytes.

I ran into disadvantages with every approach I tried. Please suggest something if you have come across this problem and solved it by any means.

Thanks
If your cache is a simple key-value store, you should not be using std::map, which has O(log n) lookup, but std::unordered_map, which has O(1) lookup. You should only use std::map if you require sorting.
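As a concrete starting point, here is a minimal sketch of such a key-value cache; the integer key type and the fixed-size Record struct are assumptions for illustration, not details from the question:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical fixed-size record, per the ~160-200 bytes in the question.
struct Record {
    char payload[200];
};

class RecordCache {
public:
    explicit RecordCache(std::size_t expected) {
        map_.reserve(expected);   // pre-size buckets to avoid rehashing 10M+ inserts
    }

    // Runtime lookup: average O(1).
    const Record* lookup(std::uint64_t key) const {
        auto it = map_.find(key);
        return it == map_.end() ? nullptr : &it->second;
    }

    // Runtime insert: overwrites if the key already exists.
    void insert(std::uint64_t key, const Record& r) {
        map_[key] = r;
    }

private:
    std::unordered_map<std::uint64_t, Record> map_;
};
```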
It sounds like performance is what you're after, so you might want to look at Boost Intrusive. You can easily combine unordered_map and list to create a high-efficiency LRU, as in the sketch below.
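A minimal sketch of that combination, using the standard library's unordered_map and list rather than Boost Intrusive (key and value types are placeholders):

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

template <typename K, typename V>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the cached value and marks it most recently used, or nullptr on miss.
    V* get(const K& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        items_.splice(items_.begin(), items_, it->second);  // move to front (MRU)
        return &it->second->second;
    }

    void put(const K& key, V value) {
        if (auto it = index_.find(key); it != index_.end()) {
            it->second->second = std::move(value);          // update in place
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (items_.size() == capacity_) {
            index_.erase(items_.back().first);              // evict LRU (back of list)
            items_.pop_back();
        }
        items_.emplace_front(key, std::move(value));
        index_[key] = items_.begin();
    }

private:
    using Item = std::pair<K, V>;
    std::size_t capacity_;
    std::list<Item> items_;                                 // MRU at front, LRU at back
    std::unordered_map<K, typename std::list<Item>::iterator> index_;
};
```

Boost Intrusive lets you build the same structure with the list hooks embedded in the nodes themselves, saving one allocation per entry.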
Read everything into memory and create a red-black tree for key access: http://www.mit.edu/~emin/source_code/cpp_trees/index.html
In one recent project, we had a database with some tens of millions of records, and we used exactly this strategy.

From your post, your data comes to about 2 GB (10 million records × ~200 bytes); with overhead, call it double. That is no problem on any 64-bit architecture.
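For illustration, a sketch of this load-everything approach; std::map is typically implemented as a red-black tree, so it stands in here for the linked MIT code, and Record/load_all_rows are hypothetical stand-ins for the real schema and bulk query:

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Record {
    char payload[200];
};

// Hypothetical bulk loader: one pass over the database at startup.
std::vector<std::pair<std::uint64_t, Record>> load_all_rows() {
    return {};  // stub; in practice, a single bulk SELECT
}

int main() {
    std::map<std::uint64_t, Record> tree;   // red-black tree: O(log n) keyed access
    for (auto& [key, rec] : load_all_rows())
        tree.emplace(key, rec);
    // 10M records x ~200 bytes ≈ 2 GB raw; node overhead roughly doubles
    // that, which a 64-bit process holds comfortably.
}
```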
I recently changed the memory allocation of our product (a 3D medical volume viewer) to use good old memory-mapped files. The advantages were:

- In my case it was just data (mostly read-only). If you have a more complex data structure, this will be more work than using "normal" objects.
- You can actually share these across processes (if they're backed by a real file). This may behave differently; I don't have experience with that.
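A minimal POSIX sketch of the idea, assuming (hypothetically) that the records are stored in a flat binary file of fixed-size structs:

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>

struct Record {
    char payload[200];
};

int main() {
    int fd = open("records.bin", O_RDONLY);   // hypothetical data file
    if (fd < 0) return 1;

    struct stat st;
    if (fstat(fd, &st) != 0) return 1;
    std::size_t count = st.st_size / sizeof(Record);

    // MAP_SHARED over a real file is what allows other processes to map it too.
    void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) return 1;
    const Record* records = static_cast<const Record*>(base);

    // records[0] .. records[count - 1] are addressable now; the OS faults
    // pages in lazily and evicts them under memory pressure, so the process
    // footprint tracks the working set, not the whole 2 GB file.
    (void)records;
    (void)count;

    munmap(base, st.st_size);
    close(fd);
}
```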