
Spark submit job fails in cluster mode but works in local mode for copyToLocal from HDFS in Java

I'm running Java code that copies files from HDFS to the local filesystem, submitted with spark-submit. The job runs fine with Spark in local mode but fails in cluster mode. It throws java.io.IOException: Target /mypath/ is a directory.

I don't understand why it fails in cluster mode, since I don't receive any exceptions in local mode.
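For reference, a minimal sketch of the kind of copy the question describes, using the Hadoop FileSystem API; the class name and paths here are placeholders, not taken from the question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyToLocal {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Resolves fs.defaultFS from the Hadoop configuration on the classpath.
        FileSystem hdfs = FileSystem.get(conf);
        // copyToLocalFile(src, dst): src is read from HDFS, dst is written to the
        // local filesystem of whichever machine runs this code -- the submitting
        // machine in local mode, but the worker node hosting the driver in cluster
        // mode, so /mypath/ must exist and be writable on that node.
        hdfs.copyToLocalFile(new Path("hdfs:///data/source/file.csv"),
                             new Path("/mypath/"));
    }
}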

That behaviour occurs because in the first case (local) your driver runs on the same machine from which you submit the whole Spark job. In the second case (cluster), your driver program is shipped to one of your workers and the process executes from there.
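As a concrete illustration, the deploy mode is what determines where the driver runs; the application class and jar names below are placeholders:

# driver stays on the machine you submit from, so local paths like /mypath/ refer to that machine
spark-submit --master yarn --deploy-mode client --class com.example.CopyToLocal my-app.jar

# driver is launched on a worker node inside the cluster, so local paths are resolved on that node
spark-submit --master yarn --deploy-mode cluster --class com.example.CopyToLocal my-app.jar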

In general, when you run Spark jobs in cluster mode and you need to pre-process local files such as JSON, XML, and so on, you need to ship them along with the executable using the --files <myfile> option. Your driver program will then be able to see that particular file. If you want to include multiple files, separate them with commas (,), as in the sketch below.
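A sketch of what that looks like, assuming a hypothetical config.json and schema.xml that the driver needs to read (the class, jar, and file names are placeholders):

spark-submit --master yarn --deploy-mode cluster \
  --class com.example.MyDriver \
  --files /local/path/config.json,/local/path/schema.xml \
  my-app.jar

On YARN the listed files are localized into the working directory of the driver and executor containers, so the driver can usually open them by bare file name (for example new File("config.json")); executors can also resolve the localized path with org.apache.spark.SparkFiles.get("config.json").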

The approach is the same when you want to add jar dependencies: use --jars <myJars>.
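For example, extra jar dependencies go on the same command line; the jar names here are placeholders:

spark-submit --master yarn --deploy-mode cluster \
  --class com.example.MyDriver \
  --jars /local/path/my-utils.jar,/local/path/json-parser.jar \
  my-app.jar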

For more details about this, check this thread.
