
Spark Kafka Producer throwing Too many open files Exception

I am trying to run a Spark Kafka job written in Java that produces around 10K records per batch to a Kafka topic. It is a Spark batch job that sequentially reads 100 HDFS part files (1 million records in total) in a loop and produces each part file's 10K records as one batch. I am using the org.apache.kafka.clients.producer.KafkaProducer API.
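For reference, here is a minimal sketch of the loop described above (the bootstrap address, topic name, input path, and the two helper methods are placeholders, not taken from the actual job). One thing worth checking in the real job is that the KafkaProducer is constructed once and reused for all 100 part files; creating a new producer per batch without closing it opens fresh sockets each time and will eventually exhaust file descriptors on the executor.

import java.util.Collections;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PartFileProducerJob {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");            // placeholder
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // One producer for the whole job, closed automatically by try-with-resources.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (String partFile : listPartFiles("/data/input")) {  // 100 part files
                for (String record : readRecords(partFile)) {       // ~10K records per file
                    producer.send(new ProducerRecord<>("topic_name", record));
                }
                producer.flush();                                   // one flush per 10K-record batch
            }
        }
    }

    // Stand-ins for the job's HDFS reading logic.
    private static List<String> listPartFiles(String dir) { return Collections.emptyList(); }
    private static List<String> readRecords(String file) { return Collections.emptyList(); }
}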

I am getting the exception below:

org.apache.kafka.common.KafkaException: Failed to construct kafka producer
....
Caused by: org.apache.kafka.common.KafkaException: java.io.IOException: Too many open files
....
Caused by: java.io.IOException: Too many open files

Below are the configurations:

Cluster Resource availability:
---------------------------------
The cluster has more than 500 nodes, 150 terabytes of total memory, and more than 30K cores

Spark Application configuration:
------------------------------------
Driver_memory: 24GB
--executor-cores: 5
--num-executors: 24
--executor-memory: 24GB

Topic Configuration:
--------------------
Partitions: 16
Replication: 3

Data size
----------
Each part file has 10K records
Total records: 1 million
Each batch produces 10K records

Please suggest some solutions, as this is a very critical issue.

Thanks in advance.

In Kafka, every topic is (optionally) split into many partitions. For each partition, the brokers maintain several files (for the index and the actual data).

kafka-topics --zookeeper localhost:2181 --describe --topic topic_name

will give you the number of partitions for topic topic_name. The default number of partitions per topic, num.partitions, is defined in /etc/kafka/server.properties.
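If it is more convenient to check this from the Java side, the same partition count can be read through Kafka's AdminClient (a sketch; the bootstrap address and topic name are placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;

public class DescribeTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singleton("topic_name"))
                                         .values().get("topic_name").get();
            System.out.println("Partitions: " + desc.partitions().size());
        }
    }
}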

The total number of open files can be huge if the broker hosts many partitions and a particular partition has many log segment files.
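As a rough illustration (the real numbers depend on segment size and retention settings): each log segment keeps a .log file plus .index and .timeindex files, so a broker hosting 50 partition replicas with 20 segments each would have on the order of 50 × 20 × 3 = 3,000 files open for those partitions alone, before counting client and replication sockets.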

You can see the current file descriptor limit by running:

ulimit -n

You can also check the number of open files using lsof:

lsof | wc -l

To solve the issue, you either need to raise the limit on open file descriptors:

ulimit -n <noOfFiles>

or somehow reduce the number of open files (for example, by reducing the number of partitions per topic).
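Keep in mind that ulimit -n only changes the limit for the current shell session; to make it permanent for the broker you would typically raise the nofile limit in /etc/security/limits.conf or, on systemd-managed hosts, set LimitNOFILE in the Kafka service unit. Since the stack trace above is thrown while constructing the producer, it is also worth checking the limit on the hosts running the Spark executors.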
