
Load balancing sockets on a horizontally scaling WebSocket server?

Every few months, when thinking through a personal project that involves sockets, I find myself asking: "How would you properly load balance sockets on a dynamic, horizontally scaling WebSocket server?"

I understand the theory behind horizontally scaling WebSockets and using pub/sub models to get data to the server that holds the socket connection for a specific user. I think I understand ways to effectively identify the server with the fewest current socket connections, which is where I would want to route a new socket connection. What I don't understand is how to effectively route new socket connections to the server you've picked for its low socket count.

I don't imagine the answer would be tied to a specific server implementation, but rather could be applied to most servers. I could easily see myself implementing this with vert.x, node.js, or even Perfect.

First off, you need to define the bounds of the problem you're asking about. If you're truly talking about dynamic horizontal scaling, where you spin servers up and down based on total load, then that's an even more involved problem than just figuring out where to route the latest incoming socket connection.

To solve that problem, you have to have a way of "moving" a socket from one host to another so you can clear connections from a host that you want to spin down (I'm assuming here that true dynamic scaling goes both up and down). The usual way I've seen that done is by engaging a cooperating client: you tell the client to reconnect, and when it reconnects it is load balanced onto a different server, so you can clear off the one you wanted to spin down. If your client already has auto-reconnect logic (like socket.io does), you can just have the server close the connection and the client will automatically reconnect.
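A minimal sketch of draining a host this way, assuming the load balancer has already stopped sending new connections to it. The function name `drain_in_batches`, the `close` callback, and the idea of closing in batches are illustrative assumptions, not something any particular server library provides.

```python
import random


def drain_in_batches(connections, batch_size, close):
    """Close every connection on a host, one batch at a time.

    `connections` is an iterable of connection ids. `close` is a
    hypothetical callback that closes one connection, which in turn
    triggers the client's reconnect logic. Returns the number of batches.
    """
    pending = list(connections)
    random.shuffle(pending)  # avoid always kicking the same clients first
    batches = 0
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        for conn_id in batch:
            close(conn_id)
        batches += 1
        # A real server would pause here so the reconnects do not all
        # hit the load balancer at the same moment.
    return batches
```

Closing in batches rather than all at once is a design choice to avoid a reconnect stampede; with only a few connections, closing them all in one go would work just as well.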

As for load balancing the incoming client connections, you have to decide what load metric you want to use. Ultimately, you need a score for each server process that tells you how "busy" you think it is, so you can put new connections on the least busy server. A rudimentary score would just be the number of current connections. If you have large numbers of connections per server process (tens of thousands) and there's no particular reason in your app that some might be much busier than others, then the law of large numbers probably averages out the load, so you could get away with using just the number of connections each server has. If the use of connections is not that even, then you may also have to factor in some sort of moving average of the CPU load over time, along with the total number of connections.
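One way such a score could look, as a sketch only: it blends the connection count with an exponentially weighted moving average of CPU samples. The class name `LoadScore`, the capacity `max_connections`, the smoothing factor `alpha`, and the 50/50 `cpu_weight` are all assumptions you would tune for your own app.

```python
class LoadScore:
    """Per-server load score: connection count blended with smoothed CPU load."""

    def __init__(self, max_connections=50_000, alpha=0.2, cpu_weight=0.5):
        self.max_connections = max_connections  # assumed capacity per process
        self.alpha = alpha                      # smoothing factor for the average
        self.cpu_weight = cpu_weight            # share of the score given to CPU
        self.connections = 0
        self.cpu_avg = 0.0

    def record_cpu(self, cpu_fraction):
        """Fold a new CPU sample (0.0 to 1.0) into the moving average."""
        self.cpu_avg = self.alpha * cpu_fraction + (1 - self.alpha) * self.cpu_avg

    def value(self):
        """Return the score; lower means less busy."""
        conn_part = min(self.connections / self.max_connections, 1.0)
        return self.cpu_weight * self.cpu_avg + (1 - self.cpu_weight) * conn_part
```

Setting `cpu_weight` to 0 reduces this to the rudimentary connection-count score.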

If you're going to load balance across multiple physical servers, then you will need a load balancer or proxy service that everyone connects to initially. That service can look at the metrics for all currently running servers in the pool and assign the connection to the one with the lowest current score. That can either be done with a proxy scheme or (more scalably) via a redirect, so the proxy gets out of the way after the initial assignment.

You could then also have a process that regularly examines your load score (however you decided to calculate it) on all the servers in the cluster. It decides when to spin a new server up, when to spin one down, and when things are too far out of balance on a given server, in which case that server needs to be told to kick several connections off, forcing them to rebalance.
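The decision part of such a process could be sketched like this. The function name `plan_actions`, the action labels, and the thresholds `high`, `low`, and `imbalance` are made up for illustration; scores are assumed to be normalized to the range 0 to 1.

```python
def plan_actions(scores, high=0.8, low=0.3, imbalance=0.25):
    """Decide scaling and rebalancing actions from per-server load scores.

    `scores` maps server name -> load score (0.0 to 1.0, lower is less busy).
    Returns a list of (action, server_or_None) tuples.
    """
    if not scores:
        return [("scale_up", None)]
    mean = sum(scores.values()) / len(scores)
    actions = []
    if mean > high:
        # The whole cluster is busy: add capacity.
        actions.append(("scale_up", None))
    elif mean < low and len(scores) > 1:
        # The cluster is mostly idle: retire the least busy server.
        victim = min(scores, key=scores.get)
        actions.append(("scale_down", victim))
    for name, score in sorted(scores.items()):
        if score - mean > imbalance:
            # This server is far busier than average: have it kick
            # some connections so they reconnect elsewhere.
            actions.append(("shed_connections", name))
    return actions
```

For example, `plan_actions({"a": 0.9, "b": 0.3, "c": 0.3})` asks only server `a` to shed connections, since the cluster as a whole is neither overloaded nor idle.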

"What I don't understand is how to effectively route new socket connections to the server you've picked with low socket count."

As described above, you use either a proxy scheme or a redirect scheme. Even though it costs slightly more at connection time, I favor the redirect scheme because it's more scalable when running and creates fewer points of failure for an existing connection. All clients connect to your incoming connection gateway server, which is responsible for knowing the current load score of each server in the farm. Based on that, it assigns an incoming connection to the host with the lowest score, and the new connection is then redirected to reconnect to that specific server in your farm.
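A rough sketch of the gateway's assignment step in a redirect scheme, using plain connection counts as the score. The `Gateway` class, its `report` and `assign` methods, the `wss://` URL shape, and the `/ws` path are assumptions for illustration; how the URL is delivered to the client (an HTTP response, a redirect, a first message) is up to you.

```python
class Gateway:
    """Tracks per-server connection counts and hands out the least loaded one."""

    def __init__(self):
        self.connections = {}  # public host -> last known connection count

    def report(self, host, count):
        """Called when a server reports its current connection count."""
        self.connections[host] = count

    def assign(self, path="/ws"):
        """Return the WebSocket URL the client should connect to directly."""
        if not self.connections:
            raise RuntimeError("no servers registered")
        host = min(self.connections, key=self.connections.get)
        # Count the pending connection right away so a burst of arrivals
        # between reports does not all land on the same server.
        self.connections[host] += 1
        return f"wss://{host}{path}"
```

After the client receives the URL it opens the WebSocket to that server directly, so the gateway is no longer in the data path for the life of the connection.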


I have also seen load balancing done purely by a custom DNS implementation. The client requests the IP address for farm.somedomain.com, and the custom DNS server gives it the IP address of the host it wants that client assigned to. Each client that looks up the IP address for farm.somedomain.com may get a different IP address. You spin hosts up or down by adding them to or removing them from the custom DNS server, and it is that custom DNS server that has to contain the load balancing logic and know the current load scores of all the running hosts.

Route the WebSocket requests to a load balancer that makes the decision about where to send the connections.

As an example, HAProxy has a leastconn method for long-lived connections that picks the server with the lowest connection count; among servers tied at that count, it picks the least recently used one.
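For reference, a minimal HAProxy configuration fragment using leastconn might look like the following. The section names, server names, addresses, ports, and timeout values are placeholders, not taken from any real setup.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # Applies once the connection has been upgraded to a WebSocket tunnel.
    timeout tunnel  1h

frontend ws_front
    bind *:80
    default_backend ws_back

backend ws_back
    balance leastconn
    server ws1 10.0.0.11:8080 check
    server ws2 10.0.0.12:8080 check
```

The `timeout tunnel` line matters for WebSockets because otherwise the shorter client and server timeouts can cut off idle long-lived connections.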

The HAProxy backend server weightings can also be modified by external inputs; @jfriend00 detailed the technicalities of weighting in their answer.
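One such external input, as far as I know, is HAProxy's agent check (the `agent-check` and `agent-port` server options): HAProxy connects to a small agent on each backend and reads a short ASCII reply such as `75%`, which scales that server's configured weight. The sketch below only builds the reply string; the function name `agent_reply`, the choice of idle CPU as the input, and the `min_percent` floor are assumptions, and you would still need a small TCP listener to serve the reply.

```python
def agent_reply(cpu_fraction, min_percent=1):
    """Build an agent-check reply that scales a server's weight by idle CPU.

    `cpu_fraction` is the current CPU load from 0.0 to 1.0. The floor
    `min_percent` keeps the server in rotation even when it is very busy.
    """
    clamped = min(max(cpu_fraction, 0.0), 1.0)
    idle = 1.0 - clamped
    percent = max(min_percent, round(idle * 100))
    return f"{percent}%\n"
```

A busy server thus advertises a lower weight and receives a smaller share of new connections until its load drops again.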

I found this project that might be useful: https://github.com/apundir/wsbalancer

A snippet from the description:

Websocket balancer is a stateful reverse proxy for websockets. It distributes incoming websockets across multiple available backends. In addition to load balancing, the balancer also takes care of transparently switching from one backend to another in case of mid session abnormal failure. During this failover, the remote client connection is retained as-is thus remote client do not even see this failover. Every attempt is made to ensure none of the message is dropped during this failover.

Regarding your question: the new connection will be routed by the load balancer, if it is configured to do so.

As @Matt mentioned, one example is HAProxy with the leastconn option.
