
How to debug a Spark job on Dataproc?

I have a Spark job running on a Dataproc cluster. How do I configure the environment to debug it on my local machine with my IDE?

This tutorial assumes the following:

  • You know how to create GCP Dataproc clusters, either by API calls, cloud shell commands or Web UI
  • You know how to submit a Spark Job
  • You have permissions to launch jobs, create clusters and use Compute Engine instances

After some attempts, I worked out how to debug, from your local machine, a Dataproc Spark job running on a cluster.

As you may know, you can submit a Spark job through the Web UI, by sending a request to the Dataproc API, or with the gcloud dataproc jobs submit spark command. Whichever way you choose, start by adding the following key-value pair to the properties field of the SparkJob: spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=REMOTE_PORT, where REMOTE_PORT is the port on the worker where the driver will listen for the debugger. Because of suspend=y, the driver waits for a debugger to attach before it starts running.
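For the gcloud route, the submission could look roughly like the sketch below. The cluster name, region, main class, and jar location are hypothetical placeholders, not values from the original setup. One detail to watch: the JDWP string contains commas, which gcloud normally reads as the separator between properties, so the sketch uses gcloud's alternate-delimiter prefix (^#^) for the --properties flag.

```shell
# Hypothetical values: replace with your own cluster, region, class and jar.
CLUSTER_NAME="my-cluster"
REGION="us-central1"
REMOTE_PORT=9094
MAIN_CLASS="com.example.MyJob"
JAR_URI="gs://my-bucket/my-job.jar"

# JDWP agent options: the driver JVM listens on REMOTE_PORT and, because of
# suspend=y, waits for a debugger to attach before running the job.
JDWP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${REMOTE_PORT}"

submit_debug_job() {
  # The ^#^ prefix switches the property separator from ',' to '#'
  # (see `gcloud topic escaping`), so the commas inside JDWP_OPTS survive.
  gcloud dataproc jobs submit spark \
    --cluster="${CLUSTER_NAME}" \
    --region="${REGION}" \
    --class="${MAIN_CLASS}" \
    --jars="${JAR_URI}" \
    --properties="^#^spark.driver.extraJavaOptions=${JDWP_OPTS}"
}
```

Calling submit_debug_job submits the job, which then blocks until a debugger attaches. If the cluster image runs Java 9 or later, it may be worth writing the address as *:REMOTE_PORT, since a bare port makes the agent listen on localhost only in those versions.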

Chances are your cluster is on a private network, in which case you need to create an SSH tunnel to REMOTE_PORT. If it is not, you're lucky: you just need to connect from your IDE to the worker's public IP on the specified REMOTE_PORT.

Using IntelliJ it would be like this:

[Screenshot: debugging on a cluster with a public IP]

where worker-ip is the worker that is listening (I used 9094 as the port this time). In my attempts it was always worker number 0, but you can connect to the worker and check whether a process is listening using netstat -tulnp | grep REMOTE_PORT
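The same check can be run without opening an interactive session, by passing the command to gcloud compute ssh. This is only a sketch: the project, zone, and cluster name are hypothetical, and sudo is added on the assumption that netstat needs root to show the owning process of another user's socket.

```shell
# Hypothetical values: replace with your own project, zone and cluster.
PROJECT="my-project"
ZONE="us-central1-a"
CLUSTER_NAME="my-cluster"
REMOTE_PORT=9094

check_debug_port() {
  # Runs netstat on worker 0 and keeps only the lines mentioning REMOTE_PORT.
  # No matching line means no process is listening on that worker.
  gcloud compute ssh "${CLUSTER_NAME}-w-0" \
    --project="${PROJECT}" \
    --zone="${ZONE}" \
    --command="sudo netstat -tulnp | grep ${REMOTE_PORT}"
}
```

If check_debug_port prints nothing for worker 0, changing the -w-0 suffix lets you repeat the check on the other workers.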

If for whatever reason your cluster does not have a public IP, you need to set up an SSH tunnel from your local machine to the worker. After setting your ZONE and PROJECT, create a tunnel to REMOTE_PORT:

gcloud compute ssh CLUSTER_NAME-w-0 --project=$PROJECT --zone=$ZONE -- -4 -N -L LOCAL_PORT:CLUSTER_NAME-w-0:REMOTE_PORT

Then point the debug configuration in your IDE at host=localhost (127.0.0.1) and port=LOCAL_PORT.
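To confirm the tunnel works before configuring the IDE, one option is to attach with jdb, the command-line debugger shipped with the JDK. This is a sketch rather than part of the original setup, and the LOCAL_PORT value is a hypothetical placeholder that must match the one used in the tunnel command.

```shell
# Hypothetical value: must match the LOCAL_PORT given to -L in the tunnel.
LOCAL_PORT=5005

attach_jdb() {
  # SocketAttach is the JDI connector for attaching to a JVM that is
  # already listening for a debugger on a TCP port.
  jdb -connect "com.sun.jdi.SocketAttach:hostname=localhost,port=${LOCAL_PORT}"
}
```

Running attach_jdb while the tunnel is open should give a jdb prompt for the suspended driver. Exit jdb before attaching from the IDE, since the agent accepts one debugger connection at a time.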
