
What does it mean when you get "too many open files" on the Kafka Consumer/Client?

I know that this usually means the ulimit needs to be increased. But what does this actually mean when it happens on the consumer side?

I'm using Apache Flink and I got this error on my Flink task node. When I rebooted my Flink node and redeployed the job, it worked fine. The brokers also seemed fine at the time.

I have a total of 9 tasks running over 3 nodes. Max parallelism for any one job is 1 to 2, so let's assume a worst case of 18 parallel threads across the 3 nodes.

org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:799)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:650)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:630)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaPartitionDiscoverer.initializeConnections(KafkaPartitionDiscoverer.java:58)
at org.apache.flink.streaming.connectors.kafka.internals.AbstractPartitionDiscoverer.open(AbstractPartitionDiscoverer.java:94)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.open(FlinkKafkaConsumerBase.java:504)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:424)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:290)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.KafkaException: java.io.IOException: Too many open files
at org.apache.kafka.common.network.Selector.<init>(Selector.java:154)
at org.apache.kafka.common.network.Selector.<init>(Selector.java:188)
at org.apache.kafka.common.network.Selector.<init>(Selector.java:192)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:722)
... 11 more
Caused by: java.io.IOException: Too many open files
at sun.nio.ch.IOUtil.makePipe(Native Method)
at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
at java.nio.channels.Selector.open(Selector.java:227)
at org.apache.kafka.common.network.Selector.<init>(Selector.java:152)
... 14 more

Every Kafka client (producer or consumer) maintains, in the worst case, one socket per broker in the cluster it's connected to.
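As an illustration (not Flink's internal code, just a hand-rolled sketch with hypothetical broker and topic names), a KafkaConsumer holds its sockets from construction until close(), so an instance that is created but never closed leaks descriptors:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerLifecycle {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker list; the consumer may end up with one socket per broker.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("group.id", "demo-group");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        // try-with-resources guarantees close(), which releases the sockets;
        // a consumer that is constructed but never closed leaks file descriptors.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
        }
    }
}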

So you're looking at roughly the number of Kafka clients Flink creates multiplied by the number of brokers in your cluster; with the 18 threads above, that is up to 18 × (broker count) sockets in the worst case.

Sockets count as file handles for the purposes of ulimit.
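To see how close the process is to that limit, a JVM on Unix can report its own descriptor usage; a minimal sketch, assuming a HotSpot-style JVM whose OperatingSystemMXBean implements com.sun.management.UnixOperatingSystemMXBean:

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdUsage {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            // Sockets are included in the open-descriptor count, so leaked
            // Kafka clients show up here long before the hard limit is hit.
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount());
            System.out.println("max fds:  " + unix.getMaxFileDescriptorCount());
        }
    }
}

Running the same check periodically inside the task JVM would show whether the count climbs with every job restart.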

I don't know how many Kafka clients Flink creates internally. You could grab a heap dump and see how many client objects are in there.
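If an external tool isn't convenient, the dump can also be triggered from inside the JVM via the HotSpot diagnostic MXBean; a minimal sketch, assuming a HotSpot JVM (the output path is hypothetical, and the file must not already exist):

import java.lang.management.ManagementFactory;

import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDump {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // live=true keeps only reachable objects; open the resulting .hprof in a
        // tool like Eclipse MAT and count the KafkaConsumer instances.
        diag.dumpHeap("/tmp/flink-task.hprof", true);
    }
}

Alternatively, jmap -dump:live,format=b,file=heap.hprof <pid> takes the same dump from outside the process.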

