Performance of an Akka HTTP REST API for producing to Kafka

Question

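I am building an API with Akka that produces to a Kafka bus. I have been load testing the application with Gatling and noticed that once Gatling creates more than 1000 users, the API starts to struggle. On average it handles about 170 requests per second, which seems very low to me.

The main entry point of the API is: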

import akka.actor.{Props, ActorSystem}

import akka.http.scaladsl.Http
import akka.http.scaladsl.model._
import akka.pattern.ask
import akka.http.scaladsl.server.Directives
import akka.http.scaladsl.unmarshalling.Unmarshaller
import akka.stream.ActorMaterializer
import com.typesafe.config.{Config, ConfigFactory}
import play.api.libs.json.{JsObject, Json}

import scala.concurrent.{Future, ExecutionContext}
import akka.http.scaladsl.server.Directives._
import akka.util.Timeout
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

case class PostMsg(msg:JsObject)
case object PostSuccess
case class PostFailure(msg:String)

class Msgapi(conf:Config) {
  implicit val um:Unmarshaller[HttpEntity, JsObject] = {
    Unmarshaller.byteStringUnmarshaller.mapWithCharset { (data, charset) =>
      Json.parse(data.toArray).asInstanceOf[JsObject]
    }
  }
  implicit val system = ActorSystem("MsgApi")
  implicit val timeout = Timeout(5 seconds)
  implicit val materializer = ActorMaterializer()

  val router = system.actorOf(Props(new RouterActor(conf)))

  val route = {
    path("msg") {
      post {
        entity(as[JsObject]) {obj =>
          if(!obj.keys.contains("key1") || !obj.keys.contains("key2") || !obj.keys.contains("key3")){
            complete{
              HttpResponse(status=StatusCodes.BadRequest, entity="Invalid json provided. Required fields: key1, key2, key3 \n")
            }
          } else {
            onSuccess(router ? PostMsg(obj)){
              case PostSuccess => {
                complete{
                  Future{
                    HttpResponse(status = StatusCodes.OK, entity = "Post success")
                  }
                }
              }
              case PostFailure(msg) =>{
                complete{
                  Future{
                    HttpResponse(status = StatusCodes.InternalServerError, entity=msg)
                  }
                }
              }
              case _ => {
                complete{
                  Future{
                    HttpResponse(status = StatusCodes.InternalServerError, entity = "Unknown Server error occurred.")
                  }
                }
              }
            }
          }
        }
      }
    }
  }

  def run():Unit = {
    Http().bindAndHandle(route, interface = conf.getString("http.host"), port = conf.getInt("http.port"))
  }
}

object RunMsgapi {
  def main(Args: Array[String]):Unit = {
    val conf = ConfigFactory.load()
    val api = new Msgapi(conf)
    api.run()
  }
}

The router actor is as follows:

import akka.actor.{ActorSystem, Props, Actor}
import akka.http.scaladsl.server.RequestContext
import akka.routing.{Router, SmallestMailboxRoutingLogic, ActorRefRoutee}
import com.typesafe.config.Config
import play.api.libs.json.JsObject

class RouterActor(conf:Config) extends Actor{

  val router = {
    val routees = Vector.tabulate(conf.getInt("kafka.producer-number"))(n => {
      val r = context.system.actorOf(Props(new KafkaProducerActor(conf, n )))
      ActorRefRoutee(r)
    })
    Router(SmallestMailboxRoutingLogic(), routees)
  }

  def receive = {
    case PostMsg(msg) => {
      router.route(PostMsg(msg), sender())
    }
  }
}

Finally, the Kafka producer actor:

import akka.actor.Actor
import java.util.Properties
import com.typesafe.config.Config
import kafka.message.NoCompressionCodec
import kafka.utils.Logging
import org.apache.kafka.clients.producer._
import play.api.libs.json.JsObject
import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future, Await}
import ExecutionContext.Implicits.global

import scala.concurrent.{Future, Await}
import scala.util.{Failure, Success}

class KafkaProducerActor(conf:Config, id:Int) extends Actor with Logging {
  var topic: String = conf.getString("kafka.topic")
  val codec = NoCompressionCodec.codec

  val props = new Properties()
  props.put("bootstrap.servers", conf.getString("kafka.bootstrap-servers"))
  props.put("acks", conf.getString("kafka.acks"))
  props.put("retries", conf.getString("kafka.retries"))
  props.put("batch.size", conf.getString("kafka.batch-size"))
  props.put("linger.ms", conf.getString("kafka.linger-ms"))
  props.put("buffer.memory", conf.getString("kafka.buffer-memory"))
  props.put("key.serializer", conf.getString("kafka.key-serializer"))
  props.put("value.serializer", conf.getString("kafka.value-serializer"))

  val producer = new KafkaProducer[String, String](props)

  def receive = {
    case PostMsg(msg) => {
      // push the msg to Kafka
      try{
        val res = Future{
          producer.send(new ProducerRecord[String, String](topic, msg.toString()))
        }
        val result = Await.result(res, 1 second).get()
        sender ! PostSuccess
      } catch{
        case e: Exception => {
          println(e.printStackTrace())
          sender ! PostFailure("Kafka push error")
        }
      }
    }
  }
}

My idea was that I could easily specify in application.conf how many producers there should be, which would allow better horizontal scaling.
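For illustration, an application.conf matching the keys that the code above reads might look roughly like the sketch below. Only the key names come from the code; every value is an assumption chosen for the example (the port matches the Gatling base URL), not the configuration used in the test.

```hocon
# Hypothetical values; only the key names are taken from the code above.
http {
  host = "localhost"
  port = 8080
}

kafka {
  topic = "messages"                 # assumed topic name
  producer-number = 4                # number of KafkaProducerActor routees
  bootstrap-servers = "localhost:9092"
  acks = "1"
  retries = "0"
  batch-size = "16384"
  linger-ms = "1"
  buffer-memory = "33554432"
  key-serializer = "org.apache.kafka.common.serialization.StringSerializer"
  value-serializer = "org.apache.kafka.common.serialization.StringSerializer"
}
```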

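However, it now looks as if the API or the router is actually the bottleneck. As a test, I disabled the Kafka producing code and replaced it with a simple sender ! PostSuccess. With 3000 users in Gatling, I still have 6% of requests failing due to timeouts, which seems like a very long time to me.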

The Gatling test I am running is as follows:

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class BasicSimulation extends Simulation {
  val httpConf = http
    .baseURL("http://localhost:8080")
    .acceptHeader("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
    .doNotTrackHeader("1")
    .acceptLanguageHeader("en-US,en;q=0.5")
    .acceptEncodingHeader("gzip, deflate")
    .userAgentHeader("Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0")
    .header("Content-Type", "application/json")

  val scn = scenario("MsgLoadTest")
    .repeat(100)(
      pace(2 seconds)
        .exec(http("request_1")
          .post("/msg").body(StringBody("""{ "key1":"something", "key2": "somethingElse", "key3":2222}""")).asJSON)
    )

  setUp(
    scn.inject(rampUsers(3000) over (5 seconds))
  ).protocols(httpConf)
}

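Update

Following some pointers from cmbaxter, I tried a few things (see the discussion in the comments) and profiled the application with VisualVM during the Gatling load test. I don't really know how to interpret the results. A lot of time seems to be spent in ThreadPoolExecutor, but maybe that is okay? Two screenshots of the profiling were attached (not included here).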

Answer 1

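To rule out the Kafka producer, I removed the logic from the actor. I still had the performance problems. So, as a final test, I rewrote the API to respond directly as soon as a POST comes in: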

val route = {
    path("msg") {
      post {
        entity(as[String]) { obj =>
          complete(
            HttpResponse(status = StatusCodes.OK, entity = "OK")
          )
        }
      }
    }
  }

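I implemented the same route in Spray to compare performance. The results are clear: Akka HTTP (at least in this test setup) does not come close to Spray's performance. Perhaps Akka HTTP can be tuned? I attached two screenshots of the Gatling response-time graphs for 3000 concurrent users making POST requests.

Akka HTTP

Spray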

Answer 2

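I would eliminate the KafkaProducerActor and the router entirely and call a Scala-wrapped version of producer.send directly. Why create a possible bottleneck when it isn't necessary? I can well imagine the global execution context or the actor system becoming the bottleneck in your current setup.

Something like this should do the trick: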

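import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerRecord, RecordMetadata}
import org.apache.kafka.common.KafkaException
import scala.concurrent.{Future, Promise}

// Thin Scala wrapper that exposes KafkaProducer.send as a Future.
class KafkaScalaProducer(val producer: KafkaProducer[String, String]) {
    def send(topic: String, msg: String): Future[RecordMetadata] = {
        val promise = Promise[RecordMetadata]()
        try {
            producer.send(new ProducerRecord[String, String](topic, msg), new Callback {
                // Called by the producer once the send has been acknowledged or has failed.
                // On failure e is non-null; on success e is null and md holds the metadata.
                override def onCompletion(md: RecordMetadata, e: java.lang.Exception): Unit = {
                    if (e == null) promise.success(md)
                    else promise.failure(e)
                }
            })
        } catch {
            // send can also throw synchronously, e.g. BufferExhaustedException,
            // which is a subclass of KafkaException.
            case e: KafkaException => promise.failure(e)
        }
        promise.future
    }

    def close(): Unit = producer.close()
}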

(Note: I haven't actually tried this code. It should be read as pseudocode.)

Then I would simply transform the future's result into an HttpResponse.
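A minimal sketch of that transformation, using only the Scala standard library. The Response case class is a hypothetical stand-in for Akka HTTP's HttpResponse so that the example has no dependencies, and the status codes and messages are borrowed from the question's route; none of this is tested against a real producer.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object TransformSketch {
  // Hypothetical stand-in for akka.http.scaladsl.model.HttpResponse.
  final case class Response(status: Int, body: String)

  // Map a successful send to 200 and a failed send to 500 without blocking a thread.
  def toResponse[A](sendResult: Future[A])(implicit ec: ExecutionContext): Future[Response] =
    sendResult
      .map(_ => Response(200, "Post success"))
      .recover { case _: Exception => Response(500, "Kafka push error") }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // Await is used here only to print the results of the demo.
    val ok = Await.result(toResponse(Future.successful("metadata")), 1.second)
    val failed = Await.result(toResponse(Future.failed[String](new RuntimeException("boom"))), 1.second)
    println(ok)
    println(failed)
  }
}
```

In the real route, the same map/recover chain would presumably build an HttpResponse instead, and the resulting future could be handed to complete, so no actor ask or Await is involved in the request path.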

After that, it is a matter of tuning the configuration. Your bottleneck is then either the Kafka producer or Akka HTTP.
