Performance issues: Kafka + Storm + Trident + OpaqueTridentKafkaSpout

Question

We are seeing some performance issues with Kafka + Storm + Trident + OpaqueTridentKafkaSpout.

Our setup details are given below.

Storm topology:

    // Single local Kafka broker; every partition is registered against it.
    Broker broker = Broker.fromString("localhost:9092")
    GlobalPartitionInformation info = new GlobalPartitionInformation()
    if (args[4]) {
        // args[4]: number of partitions in the topic
        int partitionCount = args[4].toInteger()
        for (int i = 0; i < partitionCount; i++) {
            info.addPartition(i, broker)
        }
    }
    StaticHosts hosts = new StaticHosts(info)
    // Read the "test" topic and decode each message as a string.
    TridentKafkaConfig tridentKafkaConfig = new TridentKafkaConfig(hosts, "test")
    tridentKafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme())

    OpaqueTridentKafkaSpout kafkaSpout = new OpaqueTridentKafkaSpout(tridentKafkaConfig)
    TridentTopology topology = new TridentTopology()
    // args[2]: spout parallelism; args[1]: parallelism of NEO4JTridentFunction
    Stream st = topology.newStream("spout1", kafkaSpout).parallelismHint(args[2].toInteger())
            .each(kafkaSpout.getOutputFields(), new NEO4JTridentFunction(), new Fields("status"))
            .parallelismHint(args[1].toInteger())
    Map conf = new HashMap()
    // args[3]: number of workers
    conf.put(Config.TOPOLOGY_WORKERS, args[3].toInteger())
    conf.put(Config.TOPOLOGY_DEBUG, false)

    // args[0]: "local" runs in-process; anything else submits to the cluster
    if (args[0] == "local") {
        LocalCluster cluster = new LocalCluster()
        cluster.submitTopology("mytopology", conf, topology.build())
    } else {
        StormSubmitter.submitTopology("mytopology", conf, topology.build())
        NEO4JTridentFunction.getGraphDatabaseService().shutdown()
    }

The storm.yaml we are using is as follows:

########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
     - "localhost"
#     - "server2"
# 
storm.zookeeper.port : 2999


storm.local.dir: "/opt/mphrx/neo4j/stormdatadir"

nimbus.childopts: "-Xms2048m"
ui.childopts: "-Xms1024m"
logviewer.childopts: "-Xmx512m"
supervisor.childopts: "-Xms1024m"
worker.childopts: "-Xms2600m -Xss256k -XX:MaxPermSize=128m -XX:PermSize=96m
    -XX:NewSize=1000m -XX:MaxNewSize=1000m -XX:MaxTenuringThreshold=1 -XX:SurvivorRatio=6
    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
    -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
    -server -XX:+AggressiveOpts -XX:+UseCompressedOops -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true
    -Xloggc:logs/gc-worker-%ID%.log -verbose:gc
    -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m
    -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram
    -XX:+PrintTenuringDistribution -XX:-PrintGCApplicationStoppedTime -XX:-PrintGCApplicationConcurrentTime
    -XX:+PrintCommandLineFlags -XX:+PrintFlagsFinal"

java.library.path: "/usr/lib/jvm/jdk1.7.0_25"

supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

topology.trident.batch.emit.interval.millis: 100
topology.message.timeout.secs: 300
#topology.max.spout.pending: 10000

Size of each message produced in Kafka: 11 KB
Execution time of each bolt (NEO4JTridentFunction) to process the data: 500 ms
Number of Storm workers: 1
Parallelism hint for the spout (OpaqueTridentKafkaSpout): 1
Parallelism hint for the bolt/function (NEO4JTridentFunction): 50
Throughput we are seeing from the spout: around 12 msgs/sec
Rate of messages produced in Kafka: 150 msgs/sec

Both Storm and Kafka are single-node deployments. We have read about much higher throughput from Storm but are unable to reproduce it. Please suggest how to tune the Storm + Kafka + OpaqueTridentKafkaSpout configuration to achieve higher throughput. Any help in this regard would be greatly appreciated.

Thanks,

Answer 1

You should set the spout parallelism to the partition count of the topic you are reading. By default, Trident accepts one batch per execution; you should increase this count by changing the topology.max.spout.pending property. Since Trident enforces ordered transaction management, your execution method (NEO4JTridentFunction) must be fast for you to reach the throughput you want.
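As a rough sketch of where that property goes: the snippet below builds the same kind of plain Map the question passes to submitTopology, with topology.max.spout.pending set explicitly. The class name, the helper, and the value 4 are illustrative assumptions, not recommendations. The string keys are the values behind Storm's Config.TOPOLOGY_WORKERS, Config.TOPOLOGY_MAX_SPOUT_PENDING and Config.TOPOLOGY_DEBUG constants. It is plain Java with no Storm dependency, so it only shows the shape of the config.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: builds a topology config map.
final class TopologyConfSketch {

    static Map<String, Object> build(int workers, int maxSpoutPending) {
        if (workers < 1 || maxSpoutPending < 1) {
            throw new IllegalArgumentException("workers and maxSpoutPending must be >= 1");
        }
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.workers", workers);
        // Upper limit on the number of batches Trident keeps in flight at once.
        conf.put("topology.max.spout.pending", maxSpoutPending);
        conf.put("topology.debug", false);
        return conf;
    }

    public static void main(String[] args) {
        // 1 worker as in the question; 4 pending batches is an arbitrary starting point.
        System.out.println(build(1, 4));
    }
}
```

In the question's code, matching spout parallelism to partition count would amount to passing the same number for args[2] and args[4].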

In addition, you can tune "tridentConfig.fetchSizeBytes"; by increasing it, you can ingest more data on each new emit call in your spout.
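To get a feel for what a given fetch size means with the 11 KB messages from the question, here is a back-of-the-envelope sketch. It assumes each emit call issues one fetch per partition, as the answer implies, and it ignores Kafka's per-message framing overhead, so the result is an upper bound. The class name and the 1, 2 and 4 MiB fetch sizes are hypothetical values chosen for illustration.

```java
// Hypothetical helper: estimates how many whole messages fit in one fetch.
final class FetchSizeEstimate {

    // Integer division drops any partial message at the end of the fetch.
    static long messagesPerFetch(long fetchSizeBytes, long messageBytes) {
        if (fetchSizeBytes <= 0 || messageBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return fetchSizeBytes / messageBytes;
    }

    public static void main(String[] args) {
        long messageBytes = 11L * 1024; // 11 KB per message, from the question
        for (long mib : new long[] {1, 2, 4}) {
            long fetchSizeBytes = mib * 1024 * 1024;
            System.out.println(mib + " MiB fetch: at most "
                    + messagesPerFetch(fetchSizeBytes, messageBytes) + " messages");
        }
    }
}
```

Under these assumptions a 1 MiB fetch holds at most 93 such messages per partition, so a larger fetch mainly increases batch size; it does not make the bolt any faster.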

You should also check your garbage collection log; it will give you a clue about where the real bottleneck is.

You can enable garbage collection logging by adding "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:{path}/gc-storm-worker-%ID%.log" to the worker.childopts setting in your Storm config.

Last but not least, you can use G1GC if your young generation ratio is higher than in the normal case.

Answer 2

Set your worker.childopts based on your system configuration. Use SpoutConfig.fetchSizeBytes to increase the number of bytes pulled into the topology. Increase your parallelism hint.

Answer 3

My calculation: with 8 cores and 500 ms per bolt execution, you get about 16 messages/sec (8 cores at 2 messages/sec each). If you optimize the bolt, you will see improvements.

Also, for CPU-bound bolts, try a parallelism hint equal to the total number of cores, and increase topology.trident.batch.emit.interval.millis to half the time it takes to process an entire batch. Set topology.max.spout.pending to 1.
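The arithmetic behind this answer can be written down as a small sketch. It assumes the bolt is CPU bound, so each execution keeps one core busy for its full duration; the 8 cores are this answer's assumption, not a figure from the question. The class and method names are made up for illustration.

```java
// Hypothetical helper: simple CPU-bound throughput model for a single node.
final class ThroughputEstimate {

    // Each core completes (1000 / millisPerMessage) messages per second.
    static double cpuBoundMessagesPerSecond(int cores, double millisPerMessage) {
        return cores * 1000.0 / millisPerMessage;
    }

    // Per-message time budget needed to sustain a target rate on the given cores.
    static double requiredMillisPerMessage(int cores, double targetMessagesPerSecond) {
        return cores * 1000.0 / targetMessagesPerSecond;
    }

    // The answer's rule of thumb: emit interval = batch processing time / 2.
    static double suggestedEmitIntervalMillis(double batchProcessingMillis) {
        return batchProcessingMillis / 2.0;
    }

    public static void main(String[] args) {
        int cores = 8; // assumed by the answer
        System.out.println("Ceiling at 500 ms per message: "
                + cpuBoundMessagesPerSecond(cores, 500.0) + " msgs/sec");
        System.out.println("Budget to keep up with 150 msgs/sec: "
                + requiredMillisPerMessage(cores, 150.0) + " ms per message");
    }
}
```

Under this model, keeping up with the 150 msgs/sec produced into Kafka on 8 cores would need the bolt to finish each message in roughly 53 ms, which is why optimizing NEO4JTridentFunction matters more than spout tuning here.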

3 answers

Answer 1
2 2016-03-17 14:55:00

Answer 2
0 2016-01-08 15:23:01

Answer 3
0 2016-03-19 18:25:31
