How to handle database purging in MongoDB
I use MongoDB to store 30 days of data that arrives as a stream. I am looking for a purging mechanism that lets me discard the oldest data to make room for new data. I used to use MySQL, where I handled this with partitions: I kept 30 date-based partitions, deleted the oldest one, and created a new partition to hold new data.
When I map the same idea onto MongoDB, I am inclined to use date-based 'shards'. The problem is that this skews my data distribution. If all the new data lands in the same shard, that shard becomes hot because many people access it, while the shards holding older data see much less load.
I could purge by collection instead: keep 30 collections and drop the oldest one to accommodate new data. But there are a couple of problems: 1) if I make the collections smaller, I cannot benefit much from sharding, since sharding is done per collection; 2) my queries have to change to query all 30 collections and take a union.
Please suggest a good purging mechanism (if any) to handle this situation.
There are really only three ways to do purging in MongoDB. It looks like you've already identified several of the trade-offs.
Option #1: single collection
pros
cons
Option #2: collection per day
pros
collection.drop() is very fast.
cons
Option #3: database per day
pros
cons
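To make option #2 a bit more concrete, here is a minimal sketch of the bookkeeping behind a collection-per-day rotation. The "events_" prefix and the helper names are assumptions made up for illustration; only the 30-day window comes from the question. The functions are pure, so they only decide which names to keep and which to drop.

```python
from datetime import date, timedelta
from typing import List

PREFIX = "events_"    # hypothetical collection-name prefix
RETENTION_DAYS = 30   # the 30-day window from the question


def daily_collection_name(day: date) -> str:
    """Name of the collection that holds one day's data, e.g. events_20240331."""
    return f"{PREFIX}{day:%Y%m%d}"


def collections_to_keep(today: date, days: int = RETENTION_DAYS) -> List[str]:
    """Names inside the retention window, newest first."""
    return [daily_collection_name(today - timedelta(days=i)) for i in range(days)]


def collections_to_drop(existing: List[str], today: date,
                        days: int = RETENTION_DAYS) -> List[str]:
    """Daily collections that have fallen out of the window.

    Collections without the prefix are never returned, so unrelated
    collections are left alone.
    """
    keep = set(collections_to_keep(today, days))
    return sorted(n for n in existing if n.startswith(PREFIX) and n not in keep)
```

With PyMongo, one way to use this would be to pass db.list_collection_names() as `existing` and call db.drop_collection(name) for each returned name. Queries over the whole window would still have to visit every name from collections_to_keep() and merge the results, which is the union problem raised in the question.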
Now there is an option #4, but it is not a general solution. I know of some people who did "purging" by simply using capped collections. There are definitely cases where this works, but it comes with a number of caveats, so you really need to know what you're doing.
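As a rough sketch of the capped-collection approach: a capped collection evicts its oldest documents by size, not by age, so the size has to be estimated from the expected volume. The average document size, daily document count, headroom figure, and function names below are all assumptions for illustration. Note also that capped collections cannot be sharded, which matters given the sharding concerns in the question.

```python
def capped_collection_options(avg_doc_bytes: int, docs_per_day: int,
                              days: int = 30, headroom_percent: int = 20) -> dict:
    """Estimate options for a capped collection meant to hold `days` of data.

    This is only an estimate: if documents are larger or arrive faster than
    assumed, data younger than `days` will be evicted.
    """
    raw_bytes = avg_doc_bytes * docs_per_day * days
    size = raw_bytes * (100 + headroom_percent) // 100  # integer arithmetic
    return {"capped": True, "size": size}


def create_capped(db, name: str, options: dict):
    """Create the collection; `db` is expected to be a PyMongo Database."""
    return db.create_collection(name, **options)
```

For example, capped_collection_options(500, 1000) assumes 500-byte documents arriving at 1,000 per day and asks for 18,000,000 bytes.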
You can set a TTL on a collection in MongoDB 2.2 or later. This will expire old data from the collection.
Follow this link: http://docs.mongodb.org/manual/tutorial/expire-data/
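For reference, a minimal sketch of setting up such a TTL index from PyMongo might look like the following. The field name "created_at" and the helper names are assumptions; the 30-day retention comes from the question. The indexed field needs to hold a date value, and expiry is done by a background task, so documents may linger a little past the deadline.

```python
from datetime import timedelta

TTL_FIELD = "created_at"        # hypothetical date field on each document
RETENTION = timedelta(days=30)  # the 30-day window from the question


def ttl_seconds(retention: timedelta = RETENTION) -> int:
    """Retention expressed in whole seconds, as the TTL index expects."""
    return int(retention.total_seconds())


def ensure_ttl_index(collection, retention: timedelta = RETENTION):
    """Create a TTL index on TTL_FIELD.

    `collection` is expected to be a PyMongo Collection; expireAfterSeconds
    is the index option that turns a normal index into a TTL index.
    """
    return collection.create_index(TTL_FIELD,
                                   expireAfterSeconds=ttl_seconds(retention))
```

Because this keeps everything in a single collection, queries stay unchanged and the collection can still be sharded on a key other than the date.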
I had a similar situation and this page helped me out, especially the "Helpful Scripts" section at the bottom: http://www.mongodb.org/display/DOCS/Excessive+Disk+Space
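It is better to keep one server as an archive and purge at 15-day intervals, deleting the old archives from it. Use more data partitions for archiving.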