
How to perform as-fair-as-possible load balancing based on specific resource paths

I have an application that serves artifacts from files (pages of PDF files, rendered as images). The original PDF files live on S3 and are downloaded to the servers that generate the images when a client hits one of them. These machines have a local caching mechanism that guarantees each PDF file is downloaded only once.

So, when a client comes with a request like "give me page 1 of 123.pdf", this cache is checked; if the PDF file is not there, it is downloaded from S3 and stored in the local cache, and then a process generates page 1 and sends the image back to the client.

The client itself does not know it is connected to a special server; it all looks like it is just accessing the website server. But, for the sake of performance, I would like to make sure this client is always directed to the same file server that served its first request (and downloaded the file from S3).

I could just set a cookie on the client so that it always downloads from that specific file server, but placing this on the client leads to unfair usage, as some users are going to open many documents and some are not, so I would like to perform this load balancing at the resource level (the PDF document).

Each document has a unique identifier (an integer primary key in the database). My first solution was to use Redis, storing the document id as the key and the host of the server machine that currently has this document cached as the value, but I would like to remove Redis or find a simpler way to implement this that would not require looking up keys somewhere else.
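For reference, that mapping could be sketched roughly as below with the Jedis client; the key prefix, the host names and the way a server is first picked are just placeholders, not the actual implementation:

    import redis.clients.jedis.Jedis;

    public class DocumentAffinity {
        private final Jedis jedis = new Jedis("localhost", 6379);                    // assumed Redis location
        private final String[] fileServers = {"files1.internal", "files2.internal"}; // placeholder hosts

        // Returns the file server that already caches this document, or assigns one.
        public String serverFor(long documentId) {
            String key = "document-host:" + documentId;                              // assumed key naming
            String existing = jedis.get(key);
            if (existing != null) {
                return existing;                    // some server already cached this document
            }
            String candidate = fileServers[(int) (documentId % fileServers.length)];
            jedis.setnx(key, candidate);            // SETNX so concurrent requests agree on one server
            return jedis.get(key);
        }
    }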

Also, it would be nice if the chosen algorithm or idea allowed for adding more file servers on the fly.

What would be the best way to perform this kind of load balancing with affinity based on resources?

Just for the sake of saying, this app is a mix of Ruby, Java and Scala.

I'd use the following approach in the load balancer:

  • Strip the requested resource URL to remove the query and fragment parts.
  • Turn the stripped URL into a String and take its hash code.
  • Use the hash code to select the back end server from the list of available servers; e.g.

    String[] serverNames = ...
    String serverName = serverNames[Math.floorMod(hash, serverNames.length)]; // floorMod keeps the index non-negative
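
Putting the three steps together, a minimal sketch in Java (the class name and the way the server list is supplied are placeholders):

    import java.net.URI;

    public class ResourceAffinityBalancer {
        private final String[] serverNames;

        public ResourceAffinityBalancer(String[] serverNames) {
            this.serverNames = serverNames;                                      // the pool of file servers
        }

        // The same resource path always maps to the same back end server.
        public String serverFor(String requestUrl) {
            String stripped = URI.create(requestUrl).getPath();                  // drops the query string and fragment
            int index = Math.floorMod(stripped.hashCode(), serverNames.length);  // hashCode() can be negative
            return serverNames[index];
        }
    }

Any request for the same path, with or without query parameters, then lands on the same file server.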

This spreads the load evenly across all servers, and always sends the same request to the same server. If you add more servers, it adjusts itself... though you take a performance hit while the caching warms up again.
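
If the warm-up hit when adding servers is a concern, a common refinement of the same idea is consistent hashing: place the servers on a hash ring so that adding one remaps only roughly 1/N of the resources instead of nearly all of them. A minimal sketch, with an illustrative replica count and hash function:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.SortedMap;
    import java.util.TreeMap;

    public class ConsistentHashRing {
        private static final int REPLICAS = 100;               // virtual nodes per server, illustrative value
        private final TreeMap<Long, String> ring = new TreeMap<>();

        public void addServer(String serverName) {
            for (int i = 0; i < REPLICAS; i++) {
                ring.put(hash(serverName + "#" + i), serverName);
            }
        }

        public void removeServer(String serverName) {
            for (int i = 0; i < REPLICAS; i++) {
                ring.remove(hash(serverName + "#" + i));
            }
        }

        // The first server clockwise from the resource's position on the ring.
        public String serverFor(String resourcePath) {
            if (ring.isEmpty()) {
                throw new IllegalStateException("no servers registered");
            }
            SortedMap<Long, String> tail = ring.tailMap(hash(resourcePath));
            return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
        }

        private static long hash(String key) {
            try {
                byte[] digest = MessageDigest.getInstance("MD5")
                        .digest(key.getBytes(StandardCharsets.UTF_8));
                long h = 0;
                for (int i = 0; i < 8; i++) {                   // first 8 bytes of the digest as a long
                    h = (h << 8) | (digest[i] & 0xFF);
                }
                return h;
            } catch (NoSuchAlgorithmException e) {
                throw new AssertionError(e);                    // MD5 is always present in the JDK
            }
        }
    }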

I don't think you want to aim for "fairness"; i.e. some kind of guarantee that each request takes roughly the same time. To achieve fairness you would need to actively monitor the load on each backend and dispatch according to load. That is going to (somewhat) negate the caching / affinity, and it is going to consume resources to do the measurement and the load-balancing decision making. A dumb load-spreading approach (e.g. my suggestion) should give you better throughput overall for your use-case.
