Getting very poor performance with Galera compared to a standalone MariaDB server

I am getting unacceptably low performance with the Galera setup I created. The setup has 2 nodes in active-active mode, and I am doing reads and writes on both nodes in round-robin fashion through an HAProxy load balancer.

I was easily able to get over 10,000 TPS from my application with a single MariaDB server with the following configuration: 36 vCPUs, 60 GB RAM, SSD, 10 Gbit dedicated pipe.

With Galera I am barely getting 3,500 TPS, even though I am using 2 DB nodes (36 vCPUs, 60 GB RAM) load balanced by HAProxy. For information, HAProxy is hosted as a standalone node on a different server. I have since removed HAProxy, but performance has not improved.

Can someone please suggest tuning parameters in my.cnf that I should consider to fix this severely under-performing setup?

I am using the following my.cnf file:

[Two images were attached here; their contents are not available.]

> I was easily able to get over 10,000 TPS from my application with a single MariaDB server with the following configuration: 36 vCPUs, 60 GB RAM, SSD, 10 Gbit dedicated pipe.
>
> With Galera I am barely getting 3,500 TPS, even though I am using 2 DB nodes (36 vCPUs, 60 GB RAM) load balanced by HAProxy.

Clusters based on Galera are not designed to scale writes the way you intend. In fact, as Rick mentioned in his answer, sending writes for the same tables to multiple nodes will end up causing certification conflicts, which show up as deadlocks in your application and add huge overhead.
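
Because a certification conflict surfaces to the client as a deadlock error, an application that writes to more than one node usually needs retry logic. The following is a minimal sketch that is not tied to any particular driver: `DeadlockError` and `run_with_retry` are hypothetical names, and a real application would instead catch its driver's exception for MariaDB error 1213 (ER_LOCK_DEADLOCK).

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver's deadlock exception (MariaDB error 1213)."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction() until it succeeds or max_attempts is reached.

    transaction is any callable that runs one complete unit of work
    (BEGIN ... COMMIT) and raises DeadlockError when the commit is rejected.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            # Linear backoff before replaying the whole transaction.
            time.sleep(backoff_seconds * attempt)


# Simulated transaction that is rejected twice before succeeding.
calls = {"count": 0}


def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated certification conflict")
    return "committed"


result = run_with_retry(flaky_transaction)
```

Every retry replays the whole transaction, so this overhead grows with the conflict rate; routing all writes to one node avoids most of it.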

> I am getting unacceptably low performance with the Galera setup I created. The setup has 2 nodes in active-active mode, and I am doing reads and writes on both nodes in round-robin fashion through an HAProxy load balancer.

Please send all writes to a single node and see if that improves performance. There will always be some overhead from the virtually synchronous replication that Galera uses, which adds network overhead to every write you perform. True clock-based parallel replication will offset this impact quite a bit, but you are still bound to see slightly lower throughput.

Also make sure to keep your transactions short and COMMIT as soon as you are done with an atomic unit of work, since the replication-certification process is single-threaded and will stall writes on the other nodes. If your writer node shows transactions in the wsrep pre-commit stage, that means either the other nodes are certifying a large transaction or the node is suffering a performance problem of some sort (swap, full disk, abusively large reads, etc.).
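
One way to keep transactions short is to split a large batch into small chunks and commit after each one. This is a minimal sketch, assuming hypothetical `write_rows` and `commit` callables supplied by the application; no real driver API is used, and the chunk size of 500 is an arbitrary example value.

```python
def chunked(rows, size):
    """Yield consecutive slices of rows with at most size items each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_in_small_transactions(rows, write_rows, commit, size=500):
    """Write rows in chunks, committing after each chunk.

    write_rows and commit are hypothetical callables provided by the caller.
    Returns the number of commits issued.
    """
    commits = 0
    for chunk in chunked(rows, size):
        write_rows(chunk)
        commit()
        commits += 1
    return commits


# Demonstration with in-memory stand-ins instead of a database.
written = []
commit_count = write_in_small_transactions(
    list(range(1200)), write_rows=written.extend, commit=lambda: None, size=500
)
```

This only fits work where the batch does not have to be atomic as a whole, since each chunk becomes visible as soon as it commits.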

Hope that helps, and let us know how it goes when you move to a single node.

Turn off the query cache (QC):

query_cache_size = 0      # not 22 bytes
query_cache_type = OFF    # QC is incompatible with Galera

Increase innodb_io_capacity.

How far apart (ping time) are the two nodes?
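
Ping time matters because, with synchronous replication, a commit cannot finish until the write set has reached the other node, so each commit costs at least one network round trip. The rough model below is an assumption for illustration only: it ignores certification, apply time, and disk flushes. `max_commits_per_second` is a hypothetical helper, and the numbers are example values, not measurements from this cluster.

```python
def max_commits_per_second(rtt_ms, connections):
    """Upper bound on commits/s if every commit waits one round trip.

    Each connection commits serially, so it can finish at most
    1000 / rtt_ms commits per second; connections run in parallel.
    """
    if rtt_ms <= 0 or connections <= 0:
        raise ValueError("rtt_ms and connections must be positive")
    return connections * 1000.0 / rtt_ms


# Example values only: 0.5 ms between nodes, 10 client connections.
ceiling = max_commits_per_second(rtt_ms=0.5, connections=10)
print(ceiling)
```

If the ceiling for the real ping time and connection count is near the observed TPS, network latency rather than my.cnf tuning is the likely limit.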

I suggest you treat it as Master-Slave. That is, have HAProxy send all traffic to one node, leaving the other as a hot backup. Certain things can run faster in this mode; I don't know about your app.
