
Kafka + Spark Streaming - Fairness between partitions?

I have a 20 partition topic in Kafka and am reading it with Spark Streaming (8 executors, 3 cores each). I'm using the direct stream method of reading.

I'm having problems because the first 12 partitions are getting read at a faster rate than the last 8 for some reason. So, data in the last 8 is getting stale (well, staler).

Partitions 12-19 are around 90% caught up to partitions 0-11, but we're talking about billions of messages, so data that is 10% behind in a topic partition is pretty significantly stale.

Is this normal? Can I make sure Kafka consumes the partitions more fairly?
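
For reference, the direct stream setup described above would look roughly like the sketch below. This is not code from the question; it assumes the spark-streaming-kafka-0-10 integration, and the broker address, topic name, group id and batch interval are all placeholders.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val conf = new SparkConf().setAppName("twenty-partition-reader") // app name is a placeholder
val ssc = new StreamingContext(conf, Seconds(10))                // batch interval is an assumption

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9092",                         // placeholder broker list
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "spark-streaming-group",                         // placeholder group id
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

// Direct stream: one RDD partition per Kafka partition, so 20 tasks per batch here.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Array("my-topic"), kafkaParams)      // placeholder topic
)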

Kafka divides the partitions of a topic over the consumer instances within a consumer group. Each consumer in the consumer group is an exclusive consumer of a "fair share" of the partitions. This is how Kafka does load balancing of consumers within a consumer group. Consumer membership within a consumer group is handled dynamically by the Kafka protocol. If a new consumer joins a consumer group, it gets a share of the partitions. If a consumer dies, its partitions are split among the remaining live consumers in the consumer group. This is how Kafka handles failover of consumers in a consumer group.
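
As a minimal illustration of that assignment behaviour (not code from the answer; the broker, topic and group names are placeholders), running several copies of a plain consumer like this with the same group.id would give each instance a disjoint share of the topic's partitions:

import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer

val props = new Properties()
props.put("bootstrap.servers", "broker1:9092")   // placeholder broker
props.put("group.id", "fair-share-demo")         // all instances share this group id
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(java.util.Collections.singletonList("my-topic")) // placeholder topic
consumer.poll(1000)                              // first poll joins the group and triggers assignment
println("Assigned: " + consumer.assignment().asScala.mkString(", "))
consumer.close()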

UnderReplicatedPartitions:

In a healthy cluster, the number of in-sync replicas (ISRs) should be exactly equal to the total number of replicas. If partition replicas fall too far behind their leaders, the follower partition is removed from the ISR pool, and you should see a corresponding increase in IsrShrinksPerSec. Since Kafka's high-availability guarantees cannot be met without replication, investigation is certainly warranted should this metric value exceed zero for extended time periods.

IsrShrinksPerSec / IsrExpandsPerSec:

The number of in-sync replicas (ISRs) for a particular partition should remain fairly static; the only exceptions are when you are expanding your broker cluster or removing partitions. In order to maintain high availability, a healthy Kafka cluster requires a minimum number of ISRs for failover. A replica can be removed from the ISR pool for a couple of reasons: it is too far behind the leader's offset (user-configurable by setting the replica.lag.max.messages configuration parameter), or it has not contacted the leader for some time (configurable with the replica.socket.timeout.ms parameter). No matter the reason, an increase in IsrShrinksPerSec without a corresponding increase in IsrExpandsPerSec shortly thereafter is cause for concern and requires user intervention. The Kafka documentation provides a wealth of information on the user-configurable parameters for brokers.

https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/
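
If you want to check the UnderReplicatedPartitions gauge yourself rather than through a monitoring service, a broker started with JMX enabled exposes it as an MBean. The sketch below is an illustration only; the host and port are placeholders.

import javax.management.ObjectName
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

// Connect to a broker's JMX endpoint (requires the broker to be started with JMX enabled,
// e.g. JMX_PORT=9999; host and port here are placeholders).
val url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi")
val connector = JMXConnectorFactory.connect(url)
try {
  val connection = connector.getMBeanServerConnection
  val mbean = new ObjectName("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions")
  // In a healthy cluster this gauge should read 0.
  println("UnderReplicatedPartitions = " + connection.getAttribute(mbean, "Value"))
} finally {
  connector.close()
}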

Under fair sharing, Spark assigns tasks between jobs in a "round robin" fashion, so that all jobs get a roughly equal share of cluster resources. This means that short jobs submitted while a long job is running can start receiving resources right away and still get good response times, without waiting for the long job to finish.

By default, Spark's scheduler runs jobs in FIFO fashion: the first job gets priority on all available resources while its stages have tasks to launch, then the second job gets priority, etc.

With Spark Streaming you can configure the fair scheduling mode, and Spark Streaming's JobScheduler should then submit the Spark jobs per topic in parallel.

To enable the fair scheduler, simply set the spark.scheduler.mode property to FAIR when configuring a SparkContext:

val conf = new SparkConf().setMaster(...).setAppName(...)
conf.set("spark.scheduler.mode", "FAIR")
val sc = new SparkContext(conf)
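
As a follow-up (not part of the quoted answer): once FAIR mode is enabled, jobs submitted from a given thread can optionally be directed into a named pool; the pool name below is just an example, and pools themselves can be defined via spark.scheduler.allocation.file.

// Jobs submitted from this thread now go to the "streaming" pool instead of the default pool.
sc.setLocalProperty("spark.scheduler.pool", "streaming")
// ... submit jobs ...
// Clear the property to return to the default pool.
sc.setLocalProperty("spark.scheduler.pool", null)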

In my particular case, it turns out that I'm hitting some sort of bug (possibly in MapR's distribution).

The bug causes the offsets of certain partitions to reset to 0, which, when observed later, makes those partitions look just incrementally a little behind.

I found configuration parameters which mitigate the issue, and a much larger discussion on the topic is available here: https://community.mapr.com/thread/22319-spark-streaming-mapr-streams-losing-partitions

Configuration Example - On Spark Context

 .set("spark.streaming.kafka.consumer.poll.ms", String.valueOf(Config.config.AGG_KAFKA_POLLING_MS))
 .set("spark.streaming.kafka.maxRetries", String.valueOf(10))

Edit

Confirmed that other people have had this issue as well with Spark Streaming + MapR-Streams/Kafka; this configuration seemed to lessen the chance of it happening, but it did eventually come back.

You can work around it with a safety check that detects the condition and "fixes" the offsets using a standard Kafka consumer prior to starting your Spark stream (the problem occurs when restarting the streaming app), but you have to store the offsets externally to do this. Compounding the problem, due to another bug you can't reliably provide offsets to Spark 2.1.0 streaming on start-up; this is why you must manipulate the offsets with a consumer prior to starting the streaming, so that it starts from offsets already stored in Kafka.
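
A rough sketch of that kind of safety check is shown below. It is not the original author's code: loading the externally stored offsets is left to the caller, the property values are placeholders, and it assumes a Kafka 0.10/1.x consumer where committed(TopicPartition) and commitSync(Map) are available.

import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.{KafkaConsumer, OffsetAndMetadata}
import org.apache.kafka.common.TopicPartition

// Re-commit externally stored offsets for any partition whose committed offset in Kafka
// has fallen behind them (e.g. was reset to 0). Run this before starting the streaming app.
def repairOffsets(bootstrapServers: String, groupId: String,
                  storedOffsets: Map[TopicPartition, Long]): Unit = {
  val props = new Properties()
  props.put("bootstrap.servers", bootstrapServers)
  props.put("group.id", groupId)                 // must be the streaming app's group id
  props.put("enable.auto.commit", "false")
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

  val consumer = new KafkaConsumer[String, String](props)
  try {
    val toFix = storedOffsets.flatMap { case (tp, stored) =>
      // Treat a missing committed offset as 0, i.e. fully reset.
      val committed = Option(consumer.committed(tp)).map(_.offset()).getOrElse(0L)
      if (committed < stored) Some(tp -> new OffsetAndMetadata(stored)) else None
    }
    if (toFix.nonEmpty) consumer.commitSync(toFix.asJava)
  } finally {
    consumer.close()
  }
}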
