

Running Scala Spark Jobs on Existing EMR

I have a Spark job jar, aggregationfinal_2.11-0.1.jar, which I am running on my machine. Its composition is as follows:

    package deploy

    import org.apache.spark.sql.SparkSession

    object FinalJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession
          .builder()
          .appName(s"${this.getClass.getSimpleName}")
          .config("spark.sql.shuffle.partitions", "4")
          .getOrCreate()

        // continued code
      }
    }

When I run this code in local mode, it runs fine, but when I deploy it on the EMR cluster by putting its jar on the master node, it fails with the following error:

ClassNotFoundException: deploy.FinalJob

What am I missing here?

The best option is to build an uber jar (you can use the sbt assembly plugin to build it), deploy it to S3, and add a Spark step to the EMR cluster. Please check: http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-submit-step.html
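
For reference, a minimal sbt-assembly setup might look like the sketch below; the plugin and library versions, the project name, and the S3 bucket mentioned afterwards are illustrative placeholders, not taken from the question:

    // project/plugins.sbt -- pulls in the sbt-assembly plugin (version is an example)
    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")

    // build.sbt -- mark Spark as "provided" so the uber jar bundles only your own code
    // and EMR's pre-installed Spark is used at runtime
    name := "aggregationfinal"
    version := "0.1"
    scalaVersion := "2.11.12"
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4" % "provided"

Running sbt assembly then produces an uber jar under target/scala-2.11/ (named something like aggregationfinal-assembly-0.1.jar). Upload it to S3 and add a Spark step whose arguments are essentially a spark-submit invocation, e.g. --class deploy.FinalJob s3://your-bucket/aggregationfinal-assembly-0.1.jar, as described in the AWS documentation linked above.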

Try unpacking the jar into a folder with the command jar -xvf myapp.jar and look for the compiled classes. If the extracted classes do not contain the class you are executing, there is an issue with the way you build your jar. I would recommend adding the maven assembly plugin to your pom for packaging.
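
As a quicker check than unpacking the whole jar, a small sketch like the one below (assuming Scala 2.11 and the class name deploy.FinalJob from the question) lists the jar entries and reports whether the expected class file is actually inside:

    // JarCheck.scala -- verify that deploy/FinalJob.class made it into the jar
    import java.util.jar.JarFile
    import scala.collection.JavaConverters._

    object JarCheck {
      def main(args: Array[String]): Unit = {
        val jar = new JarFile(args(0)) // path to the jar, e.g. aggregationfinal_2.11-0.1.jar
        val found = jar.entries().asScala.exists(_.getName == "deploy/FinalJob.class")
        jar.close()
        println(if (found) "deploy/FinalJob.class is present" else "deploy/FinalJob.class is MISSING")
      }
    }

If the class is missing, the jar was built without your compiled sources, which points to the packaging setup rather than to EMR itself.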
