
Getting very bad performance with Galera compared to a standalone MariaDB server

Original post: 2017-01-12 10:25:59 | Tags: mysql, mariadb, database-administration, percona, galera

I am getting unacceptably low performance with the Galera setup I created. There are 2 nodes in active-active mode, and I am doing reads and writes on both nodes in round-robin fashion through an HAProxy load balancer.

I was easily able to get over 10000 TPS on my application with a single MariaDB server with the following configuration: 36 vCPU, 60 GB RAM, SSD, dedicated 10 Gbit link.

With Galera I am barely getting 3500 TPS, even though I am using 2 DB nodes (36 vCPU, 60 GB RAM) load balanced by HAProxy. For information, HAProxy is hosted standalone on a different server. I have since removed HAProxy, but performance has not improved.

Can someone please suggest tuning parameters in my.cnf that I should consider for this severely under-performing setup?

I am using the below my.cnf file:

2 answers

> I was easily able to get over 10000 TPS on my application with a single MariaDB server with the following configuration: 36 vCPU, 60 GB RAM, SSD, dedicated 10 Gbit link.
>
> With Galera I am barely getting 3500 TPS, even though I am using 2 DB nodes (36 vCPU, 60 GB RAM) load balanced by HAProxy.

Galera-based clusters are not designed to scale writes the way I see you intend to. In fact, as Rick mentioned, sending writes to multiple nodes for the same tables will end up causing certification conflicts, which surface as deadlocks in your application and add huge overhead.
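Since certification conflicts reach the application as deadlock errors (MySQL/MariaDB error code 1213), multi-writer applications usually need a retry path. The following is only a minimal sketch of that idea: `DeadlockError`, `run_with_retry`, and `make_flaky` are hypothetical names invented for illustration, and the exception class stands in for whatever your database driver actually raises.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL/MariaDB error code reported for deadlocks


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver exception carrying errno 1213."""
    errno = ER_LOCK_DEADLOCK


def run_with_retry(transaction, max_attempts=3, backoff_s=0.0):
    """Run `transaction` (a zero-argument callable), retrying on deadlock.

    Returns (result, attempts_used). Re-raises once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction(), attempt
        except DeadlockError:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)  # simple linear backoff


def make_flaky(failures):
    """Build a fake transaction that deadlocks `failures` times, then succeeds."""
    state = {"left": failures}

    def txn():
        if state["left"] > 0:
            state["left"] -= 1
            raise DeadlockError("Deadlock found when trying to get lock")
        return "committed"

    return txn


if __name__ == "__main__":
    print(run_with_retry(make_flaky(2)))
```

Every retry repeats the whole transaction, which is part of the overhead described above; routing writes to one node avoids most of these conflicts in the first place.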

> I am getting unacceptably low performance with the Galera setup I created. There are 2 nodes in active-active mode, and I am doing reads and writes on both nodes in round-robin fashion through an HAProxy load balancer.

Please send all writes to a single node and see if that improves performance. There will always be some overhead due to the virtually synchronous replication that Galera uses, which adds network overhead to each write you perform (true clock-based parallel replication will offset this impact quite a bit, but you are still bound to see slightly lower throughput).
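To get a feel for why per-write network overhead matters, here is a rough back-of-the-envelope model. It assumes each commit on a connection waits for its local commit time plus about one network round trip, and that connections do not otherwise block each other. The function name and the sample numbers are illustrative assumptions, not measurements from the setup in the question.

```python
def max_commits_per_second(local_commit_ms, rtt_ms, connections):
    """Upper bound on commits/s when every commit waits local time + one RTT.

    A deliberately simplistic model: it ignores certification conflicts,
    flow control, and disk contention.
    """
    per_commit_ms = local_commit_ms + rtt_ms
    if per_commit_ms <= 0:
        raise ValueError("total per-commit time must be positive")
    return connections * 1000.0 / per_commit_ms


if __name__ == "__main__":
    # Illustrative inputs only: 1 ms local commit, 10 client connections.
    for rtt in (0.0, 0.5, 1.0, 2.0):
        print(rtt, max_commits_per_second(1.0, rtt, 10))
```

Under this model, adding a round trip equal to the local commit time halves the achievable rate at a fixed connection count, which is why measuring the ping time between the nodes is a useful first step.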

Also make sure to keep your transactions short and COMMIT as soon as you are done with an atomic unit of work, since the replication-certification process is single-threaded and will stall writes on the other nodes. If your writer node shows transactions in the wsrep pre-commit stage, that means either the other nodes are certifying a large transaction or the node is suffering performance problems of some sort (swap, full disk, abusively large reads, etc.).

Hope that helps, and let us know how it goes when you move to a single node.

Turn off the Query Cache (QC):

query_cache_size = 0    # not 22 bytes
query_cache_type = OFF  # QC is incompatible with Galera

Increase innodb_io_capacity.

How far apart (ping time) are the two nodes?

Suggest you pretend that it is Master-Slave. That is, have HAProxy send all traffic to one node, leaving the other as a hot backup. Certain things can run faster in this mode; I don't know about your app.
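The "one active node, one hot backup" routing can be sketched in a few lines. This is only an illustration of the selection rule, with a hypothetical `pick_writer` helper and made-up node names; in practice the load balancer does this for you (in HAProxy, marking the second server as `backup` is one way to get similar behavior).

```python
def pick_writer(nodes, is_healthy):
    """Return the first healthy node in priority order, or None if all are down.

    `nodes` is ordered by preference: primary first, hot backup after it.
    `is_healthy` is a caller-supplied health check (hypothetical here).
    """
    for node in nodes:
        if is_healthy(node):
            return node
    return None


if __name__ == "__main__":
    nodes = ["db1", "db2"]  # made-up node names
    down = {"db1"}
    print(pick_writer(nodes, lambda n: n not in down))
```

All traffic goes to the first node while it is healthy, so the two nodes never certify conflicting writes against each other; the second node only takes over on failure.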