
PostgreSQL performance on EC2/EBS

What gives the best performance for running PostgreSQL on EC2? EBS in RAID? PGData on /mnt?

Do you have any preferences or experiences? The main plus of running PostgreSQL on EBS is that you can switch from one instance to another. Could that be the reason it is slower than using the /mnt partition?

PS: I'm running PostgreSQL 8.4 with a data size of about 50 GB on an Amazon EC2 xlarge (64-bit) instance.

Here is some related info; the main take-away is this post from Bryan Murphy:

Been running a very busy 170+ GB OLTP Postgres database on Amazon for 1.5 years now. I can't say I'm "happy" but I've made it work and still prefer it to running downtown to a colo at 3am when something goes wrong.

There are two main things to be wary of:

1) Physical I/O is not very good, which is why that first system used RAID0.

Let's be clear here: physical I/O is at times terrible. :)

If you have a larger database, the EBS volumes are going to become a real bottleneck. Our primary database needs 8 EBS volumes in a RAID array, and we use slony to offload requests to two slave machines, and it still can't really keep up.

There's no way we could run this database on a single EBS volume.

I also recommend you use RAID10, not RAID0. EBS volumes fail. More frequently, single volumes will experience very long periods of poor performance. The more drives you have in your RAID, the more you'll smooth things out. However, there have been occasions where we've had to swap out a poorly performing volume for a new one and rebuild the RAID to get things back up to speed. You can't do that with a RAID0 array.
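
For anyone setting this up today, a minimal sketch of provisioning the underlying volumes for such a RAID10 set might look like the following (using Python with boto3; the region, instance ID, availability zone, volume size and device names are all placeholder assumptions, and the actual array assembly on the instance is only described in a comment):

    # Sketch: create and attach several EBS volumes that will back a software
    # RAID10 array for PGDATA. All identifiers below are hypothetical.
    import boto3

    REGION = "us-east-1"
    AZ = "us-east-1a"                      # must match the instance's AZ
    INSTANCE_ID = "i-0123456789abcdef0"    # hypothetical instance ID
    DEVICES = ["/dev/sdf", "/dev/sdg", "/dev/sdh", "/dev/sdi"]  # 4 volumes for RAID10

    ec2 = boto3.resource("ec2", region_name=REGION)
    client = boto3.client("ec2", region_name=REGION)

    for device in DEVICES:
        vol = ec2.create_volume(Size=100, AvailabilityZone=AZ, VolumeType="gp2")
        # Wait until the new volume is available before attaching it.
        client.get_waiter("volume_available").wait(VolumeIds=[vol.id])
        vol.attach_to_instance(InstanceId=INSTANCE_ID, Device=device)

    # On the instance, the attached devices would then be assembled into a
    # RAID10 array (e.g. with mdadm) and mounted as the PostgreSQL data directory.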

2) Reliability of EBS is terrible by database standards; I commented on this a bit already at http://archives.postgresql.org/pgsql-general/2009-06/msg00762.php. The end result is that you must be careful about how you back your data up, with a continuous streaming backup via WAL shipping being the recommended approach. I wouldn't deploy into this environment in a situation where losing a minute or two of transactions in the case of an EC2/EBS failure would be unacceptable, because that's something that's a bit more likely to happen here than on most database hardware.

Agreed. We have three WAL-shipped spares. One streams our WAL files to a single EBS volume which we use for worst case scenario snapshot backups. The other two are exact replicas of our primary database (one in the west coast data center, and the other in an east coast data center) which we have for failover.
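
For context on what "WAL-shipped" means in practice: PostgreSQL's archive_command can point at a small helper that copies each completed WAL segment to a durable location (such as an attached EBS volume). A minimal Python sketch, with the destination directory as an assumed placeholder:

    # Sketch of a WAL archive helper, invoked by PostgreSQL as something like
    #   archive_command = 'python /usr/local/bin/archive_wal.py "%p" "%f"'
    # where %p is the path to the finished segment and %f is its file name.
    import os
    import shutil
    import sys

    ARCHIVE_DIR = "/mnt/wal-archive"   # hypothetical destination directory

    def main() -> int:
        src_path, file_name = sys.argv[1], sys.argv[2]
        dest = os.path.join(ARCHIVE_DIR, file_name)
        if os.path.exists(dest):
            return 1                   # never overwrite an existing segment
        tmp = dest + ".tmp"
        shutil.copy2(src_path, tmp)    # copy first, then rename, so a partial
        os.rename(tmp, dest)           # copy is never mistaken for a segment
        return 0                       # any non-zero exit makes PostgreSQL retry

    if __name__ == "__main__":
        sys.exit(main())

On 8.4 the failover spares would then replay those archived segments continuously (for example with pg_standby as the restore_command) to stay close to the primary.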

If we ever have to do a worst-case-scenario restore from one of our EBS snapshots, we're down for six hours, because we'll have to stream the data from our EBS snapshot back over to an EBS RAID array. 170 GB at 20 MB/sec (if you're lucky) takes a LONG time. It takes 30 to 60 minutes for one of those snapshots to become "usable" once we create a drive from it, and then we still have to bring up the database and wait an agonizingly long time for hot data to stream back into memory.
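
As a rough back-of-the-envelope check of those numbers: 170 GB at 20 MB/sec is about 170,000 MB / 20 MB/sec, roughly 8,500 seconds or two and a half hours just to copy the data, before adding the 30 to 60 minutes for the snapshot-backed volume to become usable and the time needed to warm the database caches again, which is consistent with the six-hour figure above.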

We had to fail over to one of our spares twice in the last 1.5 years. Not fun. Both times were due to instance failure.

It's possible to run a larger database on EC2, but it takes a lot of work, careful planning and a thick skin.

Bryan
