
Presto Coordinator does not have support for High Availability

Posted 2020-09-02 08:46:44. Tags: nginx, presto, high-availability, trino, coordinator

The Presto coordinator has no built-in support for high availability. It is a single point of failure (SPOF). Is there an approach to overcome this?

3 Answers

HA can mean multiple things.

There is no HA for ongoing queries, and the Presto project provides no HA for the coordinator, as this inherently needs to be tied to the deployment and monitoring system.

Your options include:

use Starburst for deployment; it has provided quick coordinator failover for more than a year now
AWS's EMR may also provide failover for the coordinator (needs verification)
build it yourself

Currently these solutions are limited: they cannot keep ongoing or currently queued queries from failing, so you still need some kind of retry on the client side. You can follow https://github.com/trinodb/trino/issues/455 for future improvements in Presto that would allow for more resilience.
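A client-side retry could look roughly like the sketch below. It is not taken from any Presto client library: CoordinatorUnavailable, run_with_retry, and the submit callable are hypothetical names, and the sketch assumes the statement is safe to resubmit from scratch (for example a SELECT).

```python
import time


class CoordinatorUnavailable(Exception):
    """Hypothetical error raised by a client wrapper when the coordinator is lost."""


def run_with_retry(submit, sql, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Resubmit the whole query when the coordinator fails mid-flight.

    `submit` is any callable that runs `sql` to completion and returns rows.
    Only safe for statements that can be re-run, such as SELECT.
    """
    for attempt in range(attempts):
        try:
            return submit(sql)
        except CoordinatorUnavailable:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the failure to the caller
            # Exponential backoff gives a failover some time to complete.
            sleep(base_delay * 2 ** attempt)
```

The query is restarted rather than resumed, because the state of a running query lives on the coordinator that went down.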

Presto Coordinator HA setup

(Ongoing queries will be impacted if a coordinator goes down)

Active/Active

Requirements

N+1 hostnames for the ELB.
Or
N+1 ports on the ELB.

N is the number of Presto clusters.

Clients are configured with the one ELB hostname that is not used as a server name. In the current setup, that is presto.client.abc.com.

Presto query protocol: https://github.com/prestodb/presto/wiki/HTTP-Protocol

It's a cursor-based implementation. A query results in a cursor, and the client iterates the cursor. Every cursor iteration response contains a next URI from which to fetch the next set of results. All the next URI links for a query must be routed to the coordinator that handled the original query.
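To make the cursor iteration concrete, here is a minimal sketch of the client loop. The nextUri and data field names come from the protocol page linked above; iterate_cursor and the fetch helper are hypothetical, and fetch stands in for an HTTP GET that returns decoded JSON.

```python
def iterate_cursor(first_page, fetch):
    """Yield rows by following nextUri links until none is returned.

    `first_page` is the decoded JSON response to the initial query POST.
    `fetch(uri)` is a hypothetical helper that GETs a URI and returns decoded JSON.
    """
    page = first_page
    while True:
        # A page may carry no rows yet, for example while the query is queued.
        yield from page.get("data", [])
        next_uri = page.get("nextUri")
        if next_uri is None:
            return  # no nextUri: the cursor is exhausted
        # This request must reach the coordinator that owns the query.
        page = fetch(next_uri)
```

Every call to fetch targets a URI handed out by the coordinator, which is why a load balancer that spreads those requests across coordinators breaks the query.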

I used nginx server names to bind a query to a coordinator. This can also be set up with multiple ports (an ELB with multiple ports instead of multiple hostnames).
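The routing decision behind the N+1 hostnames can be sketched as follows. This is a Python model of the idea, not the nginx configuration itself. Only presto.client.abc.com comes from the setup described above; the per-coordinator hostnames and backend addresses are made up for illustration, and the sketch assumes each coordinator's next URI links carry that coordinator's dedicated hostname.

```python
import itertools

# Hypothetical: one dedicated hostname per coordinator (the N hostnames).
COORDINATORS = {
    "presto.c1.abc.com": "10.0.0.1:8080",
    "presto.c2.abc.com": "10.0.0.2:8080",
}
# The shared hostname that clients are configured with (the +1).
CLIENT_HOST = "presto.client.abc.com"

_round_robin = itertools.cycle(sorted(COORDINATORS))


def route(host_header):
    """Return (server_name, backend) for a request's Host header."""
    if host_header == CLIENT_HOST:
        # New query: any coordinator will do, so rotate through them.
        name = next(_round_robin)
    elif host_header in COORDINATORS:
        # Follow-up request: the dedicated hostname pins the coordinator.
        name = host_header
    else:
        raise KeyError(f"unknown host: {host_header}")
    return name, COORDINATORS[name]
```

With ports instead of hostnames the lookup is the same, keyed on the listening port rather than the Host header.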

Since you asked about Prestodb: the single-coordinator issue is being investigated, with the aim of designing multi-coordinator support for prestodb.

It is a hard problem to solve given the current coordinator design. https://github.com/prestodb/presto/issues/3918

As you mention, putting an HA proxy in front of two coordinators is currently the best way to achieve some sort of coordinator HA.

If you run containers in Kubernetes, K8s can detect a down pod and automatically restart the coordinator, which also gives you HA to some extent.

While AWS EMR provides a multi-master environment, Presto cannot currently use it because Presto does not support multiple coordinators. (It is not in the list of services that can use this feature.)