
MongoDB on Amazon SSD-backed EC2

Posted 2013-11-29 08:21:32. Tags: mongodb, amazon-ec2, amazon, ssd

We have a MongoDB sharded cluster currently deployed on Amazon EC2 instances. Each shard is also a replica set. The instances use EBS volumes with provisioned IOPS.

We have about 30 million documents in a collection. Our queries count all of the documents in the collection that match the filters. We have indexes on almost all of the queryable fields. As a result, RAM usage reaches 100% and our working set exceeds the size of the RAM. We think the slow response of our queries is caused by EBS being slow, so we are thinking of migrating to the new SSD-backed instances.

C3 is available: http://aws.typepad.com/aws/2013/11/a-generation-of-ec2-instances-for-compute-intensive-workloads.html

I2 is coming soon: http://aws.typepad.com/aws/2013/11/coming-soon-the-i2-instance-type-high-io-performance-via-ssd.html

Our only concern is that the SSD storage is ephemeral, meaning the data will be gone once the instance stops, terminates, or fails. How can we address this? How do we automate backups? Is it a good idea to migrate to SSD to improve the performance of our queries? Do we still need to set up a sharded cluster?
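Since the question turns on the working set exceeding RAM, a rough size check can make that comparison concrete. The following is a minimal sketch, not a measurement tool: it assumes a dict shaped like the output of MongoDB's dbStats command (its dataSize and indexSize fields, in bytes), and the hot_data_fraction parameter is a hypothetical knob for the share of documents that queries actually touch. In a sharded cluster the comparison would presumably be made per shard against that shard's RAM.

```python
def working_set_fits(db_stats, ram_bytes, hot_data_fraction=1.0):
    """Rough check of whether indexes plus frequently used data fit in RAM.

    db_stats: dict with "dataSize" and "indexSize" in bytes, shaped like
              the output of MongoDB's dbStats command.
    ram_bytes: memory available to MongoDB on the node, in bytes.
    hot_data_fraction: ASSUMPTION, the share of the data that is read
              regularly (1.0 means every document is "hot").

    Returns (fits, needed_bytes).
    """
    needed = db_stats["indexSize"] + db_stats["dataSize"] * hot_data_fraction
    return needed <= ram_bytes, needed
```

With count queries that scan most of a collection, hot_data_fraction is likely close to 1.0, which is one reason faster disks alone may not remove the bottleneck.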

1 answer

Working with the ephemeral disks is a risk, but if you have your replication set up correctly it shouldn't be a huge concern. I'm assuming you've set up a three-node replica set, correct? Do you also have three nodes for your config servers?

I can speak to this from experience, as the company I'm at is set up this way. To help mitigate the risk, I'm moving towards a backup strategy that involves a hidden replica. With this setup I can shut down the hidden replica set member and one of the config servers (having first stopped the balancer), take a complete copy of the data files (replica and config server), and have a valid backup. If AWS went down in my availability zone, I'd still have a daily backup available on S3 to restore from.
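The ordering of those backup steps can be sketched as follows. This is only an outline under stated assumptions: hidden_member builds a replica set member document using MongoDB's priority and hidden fields, while ops is a hypothetical object whose methods (stop_balancer, copy_data_files, upload_to_s3, and so on) you would implement yourself with whatever tooling you use. The sketch shows only the sequencing and the guarantee that everything paused gets resumed, even if the copy fails.

```python
from contextlib import ExitStack


def hidden_member(member_id, host):
    """Member document for a hidden, non-electable backup node."""
    # A hidden member must have priority 0 so it can never become primary.
    return {"_id": member_id, "host": host, "priority": 0, "hidden": True}


def run_backup(ops):
    """Pause the cluster pieces, copy the files, then resume in reverse order.

    ops is a HYPOTHETICAL adapter; each method name below is an
    assumption for illustration, not part of any MongoDB or AWS API.
    """
    with ExitStack() as stack:
        # Stop chunk migrations first so shard and config data stay consistent.
        ops.stop_balancer()
        stack.callback(ops.start_balancer)

        # Take the hidden member offline; clients never read from it.
        ops.stop_hidden_member()
        stack.callback(ops.start_hidden_member)

        # Take one config server offline to copy its files as well.
        ops.stop_config_server()
        stack.callback(ops.start_config_server)

        # Copy the data files of both stopped processes to local staging.
        ops.copy_data_files()
    # Callbacks have now run in reverse order, so the cluster is back to
    # normal before the (possibly slow) upload starts.
    ops.upload_to_s3()
```

Uploading only after the nodes are restarted keeps the window in which the balancer is stopped as short as possible; running run_backup from a daily scheduled job would give the daily S3 copy described above.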

Hope this helps.