
Spark client never ends running in yarn-cluster mode

We are experiencing a weird problem with Spark 1.6.2. We submit our Spark applications in cluster mode. Everything is fine, but sometimes the client process that launched the application hangs. The only way to unblock it is to inspect its stderr: then it finishes. Let me explain what I mean with an example.

We are on the edge node of our cluster and we run:

spark-submit --master yarn-cluster ... &

It turns out that the client process PID is 12435. The Spark application then runs and finishes (we can see this in YARN and in the Spark UI). Nonetheless, on the edge node the process 12435 stays alive and never ends. When we then inspect its output from /proc/12435/fd/2, the process ends.
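Concretely, what we do to "inspect" it is just reading the hung process's stderr file descriptor, something like (12435 being the example PID above):

ls -l /proc/12435/fd/2
cat /proc/12435/fd/2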

I can't understand what is happening and how to fix it. Does anybody have an idea?

Thank you, Marco

This has nothing to do with Spark.

It is a shell issue: you are not redirecting the error stream anywhere.

Every command has two output streams, stdout and stderr, and you should redirect both of them when starting a background job.

If you want to redirect both streams to the same file:

spark-submit --master yarn-cluster ...  > ~/output.txt 2>&1 &

If you want errors in one file and the output log in another:

spark-submit --master yarn-cluster ...  > ~/output.txt 2>~/error.txt &
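If you do not care about the client output at all, you can also discard both streams and keep the job running after logout with nohup (a standard shell pattern, not specific to Spark):

nohup spark-submit --master yarn-cluster ...  > /dev/null 2>&1 &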
