Akka HTTP performance tuning

Question

I am running a load test against the Akka HTTP framework (version 10.0) using the wrk tool. The wrk command:

wrk -t6 -c10000 -d 60s --timeout 10s --latency http://localhost:8080/hello

The first run has no blocking calls:

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import scala.io.StdIn

object WebServer {

  implicit val system = ActorSystem("my-system")
  implicit val materializer = ActorMaterializer()
  implicit val executionContext = system.dispatcher

  def main(args: Array[String]): Unit = {
    val bindingFuture = Http().bindAndHandle(router.route, "localhost", 8080)

    println(
      s"Server online at http://localhost:8080/\nPress RETURN to stop...")
    StdIn.readLine() // let it run until user presses return
    bindingFuture
      .flatMap(_.unbind()) // trigger unbinding from the port
      .onComplete(_ => system.terminate()) // and shutdown when done
  }
}

object router {
  implicit val executionContext = WebServer.executionContext

  val route =
    path("hello") {
      get {
        complete {
          "Ok"
        }
      }
    }
}

wrk output:

Running 1m test @ http://localhost:8080/hello
  6 threads and 10000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.22ms   16.41ms   2.08s    98.30%
    Req/Sec     9.86k     6.31k   25.79k    62.56%
  Latency Distribution
     50%    3.14ms
     75%    3.50ms
     90%    4.19ms
     99%   31.08ms
  3477084 requests in 1.00m, 477.50MB read
  Socket errors: connect 9751, read 344, write 0, timeout 0
Requests/sec:  57860.04
Transfer/sec:      7.95MB

Now I add a Future call to the route and run the test again:

import scala.concurrent.Future

val route =
  path("hello") {
    get {
      complete {
        Future { // Blocking code
          Thread.sleep(100)
          "OK"
        }
      }
    }
  }

wrk output:

Running 1m test @ http://localhost:8080/hello
  6 threads and 10000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   527.07ms  491.20ms  10.00s    88.19%
    Req/Sec    49.75     39.55   257.00     69.77%
  Latency Distribution
     50%  379.28ms
     75%  632.98ms
     90%    1.08s 
     99%    2.07s 
  13744 requests in 1.00m, 1.89MB read
  Socket errors: connect 9751, read 385, write 38, timeout 98
Requests/sec:    228.88
Transfer/sec:     32.19KB

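As you can see, with the Future call only 13744 requests were served.

Following the Akka documentation, I added a separate dispatcher for the route, backed by a thread pool that creates at most 200 threads.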

implicit val executionContext = WebServer.system.dispatchers.lookup("my-blocking-dispatcher")
// config of dispatcher
my-blocking-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    // or in Akka 2.4.2+
    fixed-pool-size = 200
  }
  throughput = 1
}

After this change, performance improved somewhat:

Running 1m test @ http://localhost:8080/hello
  6 threads and 10000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   127.03ms   21.10ms 504.28ms   84.30%
    Req/Sec   320.89    175.58   646.00     60.01%
  Latency Distribution
     50%  122.85ms
     75%  135.16ms
     90%  147.21ms
     99%  190.03ms
  114378 requests in 1.00m, 15.71MB read
  Socket errors: connect 9751, read 284, write 0, timeout 0
Requests/sec:   1903.01
Transfer/sec:    267.61KB

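If I increase the pool size above 200 in the my-blocking-dispatcher config, performance stays the same.

What other parameters or configuration should I use to improve performance while still using the Future call, so that the application delivers maximum throughput?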

Answer 1

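First, a disclaimer: I haven't used the wrk tool before, so I might get something wrong. Here are the assumptions I made for this answer:

The connection count is independent of the thread count, i.e. if I specify -t4 -c10000, wrk keeps 10000 connections, not 4 * 10000.
For each connection the behaviour is as follows: it sends a request, receives the response completely, then immediately sends the next one, and so on until the time runs out.

Also, I ran the server on the same machine as wrk, and my machine seems to be weaker than yours (I have only a dual-core CPU), so I reduced the wrk thread count to 2 and the connection count to 1000 to get decent results.

The Akka HTTP version I used is 10.0.1, and the wrk version is 4.0.2.

Now to the answer. Let's look at the blocking code you have: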

Future { // Blocking code
  Thread.sleep(100)
  "OK"
}

This means that every request takes at least 100 ms. If you have 200 threads and 1000 connections, the timeline looks like this:

Msg: 0       200      400      600      800     1000     1200      2000
     |--------|--------|--------|--------|--------|--------|---..---|---...
Ms:  0       100      200      300      400      500      600      1000

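where Msg is the number of processed messages and Ms is the elapsed time in milliseconds.

This gives us 2000 messages processed per second, or about 60000 messages per 30 seconds, which mostly agrees with the test figures: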

wrk -t2 -c1000 -d 30s --timeout 10s --latency http://localhost:8080/hello
Running 30s test @ http://localhost:8080/hello
  2 threads and 1000 connections
  Thread Stats   Avg     Stdev     Max   +/- Stdev
    Latency   412.30ms   126.87ms 631.78ms   82.89%
    Req/Sec     0.95k    204.41     1.40k    75.73%
  Latency Distribution
     50%  455.18ms
     75%  512.93ms
     90%  517.72ms
     99%  528.19ms
here: --> 56104 requests in 30.09s <--, 7.70MB read
  Socket errors: connect 0, read 1349, write 14, timeout 0
Requests/sec:   1864.76
Transfer/sec:    262.23KB
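
The estimate above can be written down as a small calculation. This is only a sketch: the object and function names are hypothetical, and it assumes every request parks one pool thread for exactly blockMs with no other overhead, so the result is an upper bound rather than a prediction.

```scala
// Back-of-the-envelope bound for a blocking handler: each pool thread can
// finish at most (1000 / blockMs) requests per second.
object ThroughputBound {
  /** Upper bound on requests per second; all parameter names are illustrative. */
  def maxRequestsPerSecond(poolThreads: Int, connections: Int, blockMs: Long): Double = {
    // No more requests can be in flight than there are threads or connections.
    val inFlight = math.min(poolThreads, connections)
    inFlight * 1000.0 / blockMs
  }

  def main(args: Array[String]): Unit = {
    println(maxRequestsPerSecond(200, 1000, 100)) // prints 2000.0
    println(maxRequestsPerSecond(300, 1000, 100)) // prints 3000.0
  }
}
```

The measured 1864.76 requests/sec sits just under the 2000 bound, which is what one would expect once scheduling and HTTP overhead are added on top of the sleep.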

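Obviously, this number (2000 messages per second) is strictly bound by the thread count. For example, with 300 threads we would process 300 messages every 100 ms, so we would get 3000 messages per second, provided our system can handle that many threads. Let's see how we fare if we provide 1 thread per connection, i.e. 1000 threads in the pool: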

wrk -t2 -c1000 -d 30s --timeout 10s --latency http://localhost:8080/hello
Running 30s test @ http://localhost:8080/hello
  2 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   107.08ms   16.86ms 582.44ms   97.24%
    Req/Sec     3.80k     1.22k    5.05k    79.28%
  Latency Distribution
     50%  104.77ms
     75%  106.74ms
     90%  110.01ms
     99%  155.24ms
  223751 requests in 30.08s, 30.73MB read
  Socket errors: connect 0, read 1149, write 1, timeout 0
Requests/sec:   7439.64
Transfer/sec:      1.02MB

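As you can see, a request now takes almost exactly 100 ms on average, i.e. the same amount we put into Thread.sleep. It seems we can't get much faster than this! We are now pretty much in the standard one-thread-per-request situation, which worked well for many years until asynchronous IO let servers scale much higher.

For comparison, here are the fully non-blocking test results on my machine with the default fork-join thread pool: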

complete {
  Future {
    "OK"
  }
}

====>

wrk -t2 -c1000 -d 30s --timeout 10s --latency http://localhost:8080/hello
Running 30s test @ http://localhost:8080/hello
  2 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    15.50ms   14.35ms 468.11ms   93.43%
    Req/Sec    22.00k     5.99k   34.67k    72.95%
  Latency Distribution
     50%   13.16ms
     75%   18.77ms
     90%   25.72ms
     99%   66.65ms
  1289402 requests in 30.02s, 177.07MB read
  Socket errors: connect 0, read 1103, write 42, timeout 0
Requests/sec:  42946.15
Transfer/sec:      5.90MB

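To summarize: if you use blocking operations, you need one thread per request to achieve the best throughput, so configure your thread pool accordingly. There are natural limits on how many threads your system can handle, and you may need to tune your OS to raise the maximum thread count. For the best throughput, avoid blocking operations.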
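
The effect of pool size on blocking work can be reproduced without an HTTP server. The following is a minimal sketch using only the Scala standard library; the object name, the helper runBlocking, and the task counts are hypothetical and chosen just for illustration. It runs the same Thread.sleep(100) body on fixed pools of different sizes and reports the elapsed time.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingPoolSketch {
  /** Runs `tasks` blocking calls of `blockMs` each on a fixed pool; returns elapsed ms. */
  def runBlocking(poolSize: Int, tasks: Int, blockMs: Long): Long = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val start = System.nanoTime()
      // Same shape as the handler in the question: a Future wrapping a blocking sleep.
      val all = Future.sequence(Seq.fill(tasks)(Future { Thread.sleep(blockMs); "OK" }))
      Await.result(all, 30.seconds)
      (System.nanoTime() - start) / 1000000
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    // 40 tasks on 10 threads need four rounds of sleeping; on 40 threads, one round.
    println(s"pool of 10: ${runBlocking(10, 40, 100)} ms")
    println(s"pool of 40: ${runBlocking(40, 40, 100)} ms")
  }
}
```

With one thread per task the total time should approach a single sleep interval, which mirrors the 1000-thread wrk run above.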

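Also, don't confuse asynchronous operations with non-blocking ones. Your code with Future and Thread.sleep is a perfect example of an asynchronous but blocking operation. Lots of popular software operates in this mode (some legacy HTTP clients, Cassandra drivers, AWS Java SDKs, etc.). To fully reap the benefits of a non-blocking HTTP server, you need to be non-blocking all the way down, not just asynchronous. That may not always be possible, but it is something to strive for.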
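
To make the distinction concrete, here is a minimal sketch of a delay that is both asynchronous and non-blocking, using only the Scala and Java standard libraries. The object name and the delayed helper are hypothetical. Instead of parking a thread in Thread.sleep, a single timer thread completes a Promise when the delay expires, so 1000 pending "requests" hold no threads while they wait. In an Akka application, akka.pattern.after plays a similar role.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

object NonBlockingDelaySketch {
  // One daemon timer thread fires all callbacks; nothing sleeps per request.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "delay-timer")
      t.setDaemon(true)
      t
    }
  })

  /** Completes with `value` after `delayMs` without holding a thread while waiting. */
  def delayed[A](value: A, delayMs: Long): Future[A] = {
    val p = Promise[A]()
    timer.schedule(new Runnable {
      def run(): Unit = { p.success(value); () }
    }, delayMs, TimeUnit.MILLISECONDS)
    p.future
  }

  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    val start = System.nanoTime()
    // 1000 concurrent 100 ms delays, none of which occupies a pool thread.
    val all = Future.sequence(Seq.fill(1000)(delayed("OK", 100)))
    val results = Await.result(all, 10.seconds)
    println(s"${results.size} responses in ${(System.nanoTime() - start) / 1000000} ms")
  }
}
```

A real blocking call (a JDBC query, a legacy client) cannot be rewritten this way; it needs either a non-blocking driver or a dedicated, adequately sized dispatcher as described above.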

Answer 2

With this configuration I got a 3x performance improvement on my localhost:

akka {
  actor {
    default-dispatcher {
      fork-join-executor {
        parallelism-min = 1
        parallelism-max = 64
        parallelism-factor = 1
      }
      throughput = 64
    }
  }

  http {
    host-connection-pool {
      max-connections = 10000
      max-open-requests = 4096
    }

    server {
      pipelining-limit = 1024
      max-connections = 4096
      backlog = 1024
    }
  }
}

Perhaps other values for these parameters would work even better (if so, please let me know).

Akka HTTP version 10.1.12.


Answer 1: score 29, accepted, 2016-12-23 16:13:34

Answer 2: score 0, 2020-06-25 21:05:56
