
Why does increasing worker_connections in Nginx make the application slower in a node.js cluster?

I'm converting my application to a node.js cluster, which I hope will boost its performance.

Currently, I'm deploying the application to 2 EC2 t2.medium instances, with Nginx as a proxy and an ELB in front.

This is my Express cluster application, which is pretty much the standard example from the documentation.

var bodyParser = require('body-parser');
var cors = require('cors');
var cluster = require('cluster');
var debug = require('debug')('expressapp');

if(cluster.isMaster) {
  var numWorkers = require('os').cpus().length;
  debug('Master cluster setting up ' + numWorkers + ' workers');

  for(var i = 0; i < numWorkers; i++) {
    cluster.fork();
  }

  cluster.on('online', function(worker) {
    debug('Worker ' + worker.process.pid + ' is online');
  });

  cluster.on('exit', function(worker, code, signal) {
    debug('Worker ' + worker.process.pid + ' died with code: ' + code + ', and signal: ' + signal);
    debug('Starting a new worker');
    cluster.fork();  
  });
} else {
  // Express stuff
}

This is my Nginx configuration.

nginx::worker_processes: "%{::processorcount}"
nginx::worker_connections: '1024'
nginx::keepalive_timeout: '65'

The Nginx server has 2 CPUs.
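(With 2 worker processes and worker_connections 1024, Nginx can hold at most roughly 2 × 1024 = 2048 simultaneous connections, and since a reverse proxy also keeps an upstream connection open for every client connection, the effective ceiling for proxied clients is about half of that.)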

This is my performance before the change.

[screenshot: Gatling results before increasing worker_connections]

I get 1,500 requests/s, which is pretty good. Now I thought I would increase the number of connections in Nginx so it could accept more requests, so I did this:

nginx::worker_processes: "%{::processorcount}"
nginx::worker_connections: '2048'
nginx::keepalive_timeout: '65'

And this is my performance after the change.

[screenshot: Gatling results after increasing worker_connections]

I think that's worse than before.

I use Gatling for performance testing, and here's the code.

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class LoadTestSparrowCapture extends Simulation {
  val httpConf = http
    .baseURL("http://ELB")
    .acceptHeader("application/json")
    .doNotTrackHeader("1")
    .acceptLanguageHeader("en-US,en;q=0.5")
    .acceptEncodingHeader("gzip, deflate")
    .userAgentHeader("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:16.0) Gecko/20100101 Firefox/16.0")

    val headers_10 = Map("Content-Type" -> "application/json")

    val scn = scenario("Load Test")
      .exec(http("request_1")
        .get("/track"))

    setUp(
      scn.inject(
        atOnceUsers(15000)
      ).protocols(httpConf))
}

I deployed this to my Gatling cluster, so I have 3 EC2 instances firing 15,000 requests in 30 s at my application.

The question is: is there anything I can do to increase the performance of my application, or do I just need to add more machines?

The route that I'm testing is pretty simple: I receive the request and send it off to RabbitMQ so it can be processed further, so the response of that route is pretty fast.
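For illustration, the forwarding part of that route might look something like this (a minimal sketch assuming the amqplib client; the connection URL, queue name and port are placeholders, not the real values):

var express = require('express');
var amqp = require('amqplib/callback_api');

var app = express();

// One RabbitMQ connection and channel per worker, reused across requests.
amqp.connect('amqp://localhost', function(err, conn) {          // placeholder URL
  if (err) throw err;
  conn.createChannel(function(err, channel) {
    if (err) throw err;
    var queue = 'track_events';                                 // placeholder queue name
    channel.assertQueue(queue, { durable: true });

    // The route only hands the payload off to RabbitMQ and replies immediately,
    // so the application itself does very little work per request.
    app.get('/track', function(req, res) {
      channel.sendToQueue(queue, Buffer.from(JSON.stringify(req.query)), { persistent: true });
      res.sendStatus(202);
    });

    app.listen(3000);                                           // placeholder port
  });
});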

You've mentioned that you are using AWS with an ELB in front of your EC2 instances. As I can see, you are getting 502 and 503 status codes. These can be sent by the ELB or by your EC2 instances. Make sure that when doing the load test you know where the errors are coming from. You can check this in the AWS console in the ELB CloudWatch metrics.

Basically, HTTPCode_ELB_5XX means your ELB sent the 50x; HTTPCode_Backend_5XX means your backend sent it. You can also verify that in the ELB logs. A better explanation of ELB errors can be found here.
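For example, you can pull both metrics programmatically through the CloudWatch API (a quick sketch using the AWS SDK for Node.js; the region, load balancer name and time window are placeholders):

var AWS = require('aws-sdk');
var cloudwatch = new AWS.CloudWatch({ region: 'us-east-1' });    // placeholder region

function sum5xx(metricName, callback) {
  cloudwatch.getMetricStatistics({
    Namespace: 'AWS/ELB',
    MetricName: metricName,                                      // 'HTTPCode_ELB_5XX' or 'HTTPCode_Backend_5XX'
    Dimensions: [{ Name: 'LoadBalancerName', Value: 'my-elb' }], // placeholder ELB name
    StartTime: new Date(Date.now() - 60 * 60 * 1000),            // last hour
    EndTime: new Date(),
    Period: 60,
    Statistics: ['Sum']
  }, callback);
}

// Compare where the 5xx responses come from: the ELB itself vs. your backends.
sum5xx('HTTPCode_ELB_5XX', function(err, data) {
  console.log('ELB 5xx:', err || data.Datapoints);
});
sum5xx('HTTPCode_Backend_5XX', function(err, data) {
  console.log('Backend 5xx:', err || data.Datapoints);
});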

To load-test on AWS you should definitely read this. The point is that the ELB is just another set of machines, which needs to scale if your load increases. The default scaling strategy is (quoted from the section "Ramping Up Testing"):

Once you have a testing tool in place, you will need to define the growth in the load. We recommend that you increase the load at a rate of no more than 50 percent every five minutes.

That means that when you start with some number of concurrent users, say 1,000, by default you should increase only up to 1,500 within 5 minutes. This guarantees that the ELB will scale with the load on your servers. The exact numbers may vary and you have to test them on your own. Last time I tested it, it sustained a load of 1,200 req/s without an issue, and then I started to receive 50x. You can test this easily by running a ramp-up scenario from X to Y users from a single client and waiting for the 50x to appear.
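For example, instead of atOnceUsers the injection profile in your simulation above could be changed to ramp up gradually (a rough sketch in the Gatling 2.x DSL already used there; the user counts and durations are placeholders you would have to tune):

setUp(
  scn.inject(
    rampUsers(1000) over (60 seconds),                  // warm up slowly
    rampUsersPerSec(30) to (100) during (10 minutes)    // then grow the arrival rate gradually
  ).protocols(httpConf))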

The next very important thing (from the "DNS Resolution" part) is:

If clients do not re-resolve the DNS at least once per minute, then the new resources Elastic Load Balancing adds to DNS will not be used by clients.

In short, it means that you have to guarantee that the DNS TTL is respected, or that your clients re-resolve and rotate the DNS IPs they receive from the DNS lookup, to guarantee a round-robin distribution of the load. If not (e.g. when testing from only one client, which is not your case), you can skew the results by targeting all of the traffic at only one ELB node and overloading it. That means the ELB will not scale at all.
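As a quick sanity check, you can resolve the ELB hostname yourself and confirm that it returns several A records that your clients should rotate through (a small Node.js sketch; the hostname is a placeholder):

var dns = require('dns');

// A scaled-out ELB resolves to multiple A records; clients should re-resolve
// at least once per minute and rotate through them.
dns.resolve4('my-elb-1234.us-east-1.elb.amazonaws.com', function(err, addresses) {
  if (err) throw err;
  console.log('ELB currently resolves to:', addresses);
});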

Hope it helps.
