Sharding an existing PostgreSQL database with Postgres-XL
We want to shard our PostgreSQL database because of high disk load. First, we looked at the django-sharding library, but:
Considering all of this, we decided to look at sharding solutions at the Postgres level. We found two options: Citus and Postgres-XL. Citus would make us change our data format too much and rewrite a large part of the backend at the same time, so we are about to try Postgres-XL as the more transparent solution. But after reading the docs, there are some things I can't understand, and I would be grateful for recommendations:
Thanks.
First off, to save yourself a LOT of headache, have you looked at managed options like Amazon's Aurora, DynamoDB, Redshift, etc.? They are VERY cost effective at scale, as well as optimized and managed for you.
Actually, Amazon's plain Postgres databases can handle MASSIVE amounts of reads or writes. We can reach 2,000 to 6,000 IOPS on reads and another 2,000 to 6,000 IOPS on writes without issue. I would really look into this as an option. Azure, Oracle, and Google also have competing services.
Also be aware that Postgres-XL, beyond all reason, has no HA support. If you lose a single node, you lose everything. The nodes cannot fail over.
> It's a standalone fork?
Yes. They are very different applications and are developed separately from each other.
> How are Postgres and PostgresXL versions compatible?
They aren't compatible. You cannot just migrate Postgres to Postgres-XL. They work VERY differently.
> Generating ids with a Postgres-specific algorithm makes it impossible to move data from shard to shard
I'm not following this, but with sharding you are not supposed to move data from one shard to another. The key you shard on generally needs to be something specific and unique to split/segregate your data on, like a date, a "type" field, or some other (hopefully ordered) field(s)/column(s). This breaks things up, but it has obvious pain-in-the-a$$ limitations.
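To make the shard key idea concrete, here is a minimal Python sketch of two common ways to route a row to a shard: hashing a unique key, and ranging on an ordered field such as a date. Everything in it (`NUM_SHARDS`, `shard_for_key`, `shard_for_date`, the boundary dates) is a made-up name for illustration and is not how Postgres-XL, Citus, or django-sharding implement routing.

```python
import bisect
import hashlib
from datetime import date
from typing import List

NUM_SHARDS = 4  # hypothetical shard count, chosen only for this example


def shard_for_key(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash routing: map a unique key to a shard number.

    sha256 is used instead of the built-in hash() so the result is
    stable across processes and machines.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_for_date(d: date, boundaries: List[date]) -> int:
    """Range routing on an ordered field.

    `boundaries` must be sorted. Shard 0 holds rows dated before
    boundaries[0], shard i holds rows from boundaries[i-1] up to (but not
    including) boundaries[i], and the last shard holds everything from
    the final boundary onward.
    """
    return bisect.bisect_right(boundaries, d)


# Example routing calls (values are illustrative only).
yearly = [date(2020, 1, 1), date(2021, 1, 1)]
print(shard_for_key("user:42"))
print(shard_for_date(date(2020, 6, 15), yearly))
```

One of the limitations shows up right away: with modulo hashing, changing the number of shards reassigns most keys to a different shard, and with date ranges the newest shard tends to take most of the writes.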
> Are there any other sharding workarounds except for Citus and PostgresXL? It would be good not to change much in our database on migrating.
Tons of options, but right off the bat, going from a standard RDBMS to a NoSQL or MPP database is going to be a major migration and a lot of effort, and it will come with a LOT of limitations no matter what you do.
Next, Postgres-XL and Citus are MPP (massively parallel processing) clustering applications, not sharding tools specifically. Sharding is part of what they can do, but it is not their focus.
Other options for MPP:
pgPool -- (not great for heavy writes)
HAProxy -- (have not done it myself, but have read about it; lots of work to set up and maintain)
MySQL Cluster -- (huge pain to use the OSS version, and major $$$ for the commercial version)
Greenplum
Teradata
Vertica
> What is the best solution to migrate data?
You are very unlikely to find a simple migration path for this kind of switch. Expect to export the data yourself from the existing RDBMS and import it into the new DB, and you will likely have to write something yourself to get it the way you want it.
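As a rough idea of what "write something yourself" can look like, here is a small Python sketch of an export/transform/import loop. It assumes the old database has been dumped to CSV with a header row; `read_batches`, `migrate`, `transform`, and `load_batch` are hypothetical names, and `load_batch` stands in for whatever actually writes to the new database (it is not a real driver call).

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterator, List, TextIO

Row = Dict[str, str]


def read_batches(source: TextIO, batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of rows from a CSV export, at most batch_size rows each."""
    reader = csv.DictReader(source)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def migrate(
    source: TextIO,
    transform: Callable[[Row], Row],
    load_batch: Callable[[List[Row]], None],
    batch_size: int = 1000,
) -> int:
    """Read the export in batches, reshape each row, and hand it to the loader.

    Returns the number of rows passed to load_batch.
    """
    total = 0
    for batch in read_batches(source, batch_size):
        load_batch([transform(row) for row in batch])
        total += len(batch)
    return total


# Illustrative run against an in-memory "export"; a real run would open
# the dump file and pass a loader that inserts into the new database.
export = io.StringIO("id,email\n1,A@Example.com\n2,b@example.com\n3,c@example.com\n")
loaded: List[Row] = []
count = migrate(
    export,
    transform=lambda row: {**row, "email": row["email"].lower()},
    load_batch=loaded.extend,
    batch_size=2,
)
print(count, loaded)
```

Working in batches keeps memory flat on large tables and gives a natural point to log progress or resume after a failure.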