
Spark structured streaming job stuck for hours without getting killed

I have a structured streaming job which reads from Kafka, performs aggregations, and writes to HDFS. The job runs in cluster mode on YARN, on Spark 2.4. Every 2-3 days the job gets stuck: it doesn't fail, but hangs at some micro-batch. The micro-batch never even starts, and the driver keeps printing the following log line for hours:

 Got an error when resolving hostNames. Falling back to /default-rack for all.

When I kill the streaming job and start it again, it runs fine. How can I fix this?
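For context, the job has the shape of a typical Kafka → aggregate → HDFS pipeline. A minimal sketch of that structure (the broker address, topic name, window sizes, paths, and trigger interval here are all assumptions, not details from the question):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.Trigger

object KafkaAggToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-agg-to-hdfs").getOrCreate()
    import spark.implicits._

    // Read from Kafka (broker and topic are placeholders)
    val source = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "events")
      .load()

    // Windowed aggregation with a watermark so old state can be dropped
    val counts = source
      .selectExpr("CAST(value AS STRING) AS value", "timestamp")
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window($"timestamp", "5 minutes"), $"value")
      .count()

    // Write each completed window out to HDFS as Parquet
    val query = counts.writeStream
      .outputMode("append")
      .format("parquet")
      .option("path", "hdfs:///data/agg")
      .option("checkpointLocation", "hdfs:///checkpoints/agg")
      .trigger(Trigger.ProcessingTime("1 minute"))
      .start()

    query.awaitTermination()
  }
}
```

This requires a Spark/YARN cluster with Kafka reachable, so it is illustrative rather than directly runnable here.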

See this issue: https://issues.apache.org/jira/browse/SPARK-28005. It is fixed in Spark 3.0. It seems this happens when there are no active executors, so the driver keeps retrying rack resolution and logging that message instead of making progress.
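If upgrading to Spark 3.0 isn't an option right away, one common mitigation is to detect the stall yourself and let YARN restart the application. A sketch using Spark's `StreamingQueryListener` (the class name, stall threshold, and restart-via-exit strategy are my assumptions, not part of the linked fix; the query construction is elided):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener.{QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}

object StallWatchdog {
  @volatile private var lastProgressMs = System.currentTimeMillis()

  // Updates a timestamp every time a micro-batch reports progress.
  class ProgressListener extends StreamingQueryListener {
    override def onQueryStarted(event: QueryStartedEvent): Unit = ()
    override def onQueryProgress(event: QueryProgressEvent): Unit = {
      lastProgressMs = System.currentTimeMillis()
    }
    override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("streaming-with-watchdog").getOrCreate()
    spark.streams.addListener(new ProgressListener)

    val query = ??? // start your streaming query here, as before

    val maxStallMs = 30 * 60 * 1000L // assumed threshold: 30 minutes

    // Poll: if no micro-batch has completed within the threshold, exit
    // non-zero so YARN (with spark.yarn.maxAppAttempts > 1) relaunches
    // the application -- the same effect as the manual kill-and-restart.
    while (query.isActive) {
      query.awaitTermination(60 * 1000L)
      if (System.currentTimeMillis() - lastProgressMs > maxStallMs) {
        sys.exit(1)
      }
    }
  }
}
```

This only automates the restart you are already doing by hand; the actual fix is the upgrade.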
