Presto Coordinator does not have support for High Availability

The Presto coordinator has no built-in support for high availability. It is a single point of failure (SPOF). Is there an approach to overcome this?

HA can mean multiple things.

There is no HA for ongoing queries, and the Presto project provides no HA for the coordinator, because this is inherently tied to the deployment and monitoring system.

Your options include:

  • use Starburst for deployment; it has provided quick coordinator failover for more than a year now
  • AWS EMR may also provide failover for the coordinator (needs verification)
  • build it yourself

Currently these solutions are limited: they cannot keep ongoing or queued queries from failing, so you still need some kind of retry on the client side. You can follow https://github.com/trinodb/trino/issues/455 for future improvements in Presto that would allow for more resilience.
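To illustrate the client-side retry mentioned above, here is a minimal sketch in Python. The names run_with_retry and submit are hypothetical and not part of any Presto client. It assumes a coordinator failure surfaces as ConnectionError or TimeoutError, and that the query is safe to resubmit from scratch (for example a read-only SELECT).

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retry(
    submit: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call submit() until it succeeds or the attempts are used up.

    submit is a hypothetical callable that runs the whole query and
    returns its result. A retry resubmits the query from the start,
    because a failed coordinator loses the state of running queries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return submit()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff gives a failover or restart time to finish.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the defaults, a query is attempted up to three times, waiting 1 s and then 2 s between attempts. Statements with side effects (INSERT, CREATE TABLE AS) should not be retried blindly this way.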

Presto Coordinator HA setup

(Ongoing queries will be impacted if a coordinator goes down.)

Active/Active

Requirements

  • N+1 hostnames for the ELB.

    Or

  • N+1 ports on the ELB.

N is the number of Presto clusters.

Clients are configured with the one ELB host name that is not used as a server name. In the current setup, that is presto.client.abc.com.

Presto query protocol: https://github.com/prestodb/presto/wiki/HTTP-Protocol

It is a cursor-based implementation. A query results in a cursor, and clients iterate the cursor. Every cursor iteration response contains a next URI from which to fetch the next set of results. All the next-URI links for a query must be routed to the coordinator that handled the original query.
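The cursor iteration described above can be sketched in Python as follows. This is an illustration, not an official client: it assumes the statement is POSTed to /v1/statement with an X-Presto-User header and that each JSON response may carry data, error, and nextUri fields, as described on the linked wiki page. The helper names http_json and iter_rows and the user name example-user are made up for this example.

```python
import json
import urllib.request
from typing import Any, Callable, Iterator, Optional


def http_json(method: str, url: str, body: Optional[str] = None) -> dict:
    """Send one HTTP request and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8") if body is not None else None,
        method=method,
        headers={"X-Presto-User": "example-user"},  # placeholder user name
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_rows(
    coordinator_url: str,
    sql: str,
    fetch: Callable[..., dict] = http_json,
) -> Iterator[Any]:
    """Yield result rows by following nextUri until it is absent."""
    page = fetch("POST", coordinator_url.rstrip("/") + "/v1/statement", sql)
    while True:
        if "error" in page:
            raise RuntimeError(page["error"].get("message", "query failed"))
        # Early pages may carry no data while the query is still queued.
        yield from page.get("data", [])
        next_uri = page.get("nextUri")
        if next_uri is None:
            return  # no nextUri means the cursor is exhausted
        # The URI is used exactly as returned, so it must resolve to the
        # coordinator that owns the query.
        page = fetch("GET", next_uri)
```

Because the client follows each nextUri verbatim, any load balancer in front of several coordinators has to send those follow-up requests to the coordinator that accepted the original POST.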

Nginx server names are used to bind a query to a coordinator. This can also be set up with multiple ports (an ELB with multiple ports instead of multiple host names).
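The routing decision can be sketched in Python to show the idea. This is one reading of the setup, not the actual nginx configuration: it assumes each coordinator returns next URIs under its own host name, that the entry host name is spread across coordinators round-robin, and that all other names pin to a single coordinator. The class HostRouter, the names presto.c1.abc.com and presto.c2.abc.com, and the backend addresses are hypothetical; only presto.client.abc.com comes from the setup described above.

```python
import itertools
from typing import Dict


class HostRouter:
    """Pick a backend coordinator from the HTTP Host header."""

    def __init__(self, entry_host: str, coordinators: Dict[str, str]) -> None:
        # entry_host: the one name that clients are configured with.
        # coordinators: per-coordinator host name -> backend address.
        self.entry_host = entry_host.lower()
        self.coordinators = {h.lower(): a for h, a in coordinators.items()}
        self._round_robin = itertools.cycle(sorted(self.coordinators))

    def upstream_for(self, host_header: str) -> str:
        host = host_header.split(":")[0].lower()  # drop an optional port
        if host == self.entry_host:
            # New query: any coordinator may take it.
            return self.coordinators[next(self._round_robin)]
        if host in self.coordinators:
            # Follow-up request: stay on the coordinator that owns the query.
            return self.coordinators[host]
        raise KeyError(f"unknown host name: {host}")


router = HostRouter(
    "presto.client.abc.com",
    {
        "presto.c1.abc.com": "10.0.0.1:8080",  # hypothetical backends
        "presto.c2.abc.com": "10.0.0.2:8080",
    },
)
```

Under these assumptions, a request for presto.client.abc.com lands on either coordinator, while a request for presto.c2.abc.com always reaches 10.0.0.2:8080. This is why the setup needs N+1 host names: one per coordinator plus the shared entry name.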

Since you asked about PrestoDB: the single-coordinator issue is being investigated, with the aim of designing multi-coordinator support for PrestoDB.

It is a hard problem to solve given the current coordinator design: https://github.com/prestodb/presto/issues/3918

As you mention, putting HAProxy in front of two coordinators is the best way to achieve some sort of coordinator HA at the moment.

If you run containers in Kubernetes, K8s can detect a failed pod and automatically restart the coordinator, which also gives you HA to some extent.

While AWS EMR provides a multi-master environment, Presto does not support multiple coordinators, so it is currently not supported there (Presto is not in the list of services that can use this feature).
