
Execute a Scala jar file in Azure Data Factory


Here is the code I want to execute:

SimpleApp.scala

package test

import java.sql.DriverManager
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf


object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("bouh").setMaster("yarn")
    val sc = new SparkContext(conf)

    // JDBC connection to an Azure SQL database
    val jdbcHostname = "servername.database.windows.net"
    val jdbcPort = 1433
    val jdbcDatabase = "database"
    val jdbc_url = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=60;"

    val jdbcUsername = "user"
    val jdbcPassword = "password"

    val connection = DriverManager.getConnection(jdbc_url, jdbcUsername, jdbcPassword)
    val statement = connection.createStatement

    // Read the ids from text files in blob storage
    val rdd = sc.textFile("wasbs://dev@hdinsight.blob.core.windows.net/folder/*.txt")

    // Collect the ids to the driver and delete each one via a stored procedure
    rdd.collect().map(
      (Id: String) => {
        statement.execute(s"EXEC delete_item_by_id @Id = '${Id}'")
      }
    )
  }
}

I compiled it with IntelliJ IDEA (following this link: https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-create-standalone-application).
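For reference, a minimal build.sbt in the spirit of that tutorial could look like the sketch below. The Scala and Spark versions are assumptions and have to match the HDInsight cluster, and actually bundling the JDBC driver into the jar would additionally require sbt-assembly or an IntelliJ artifact:

name := "test"
version := "1.0"
// Assumption: the Scala version must match the one Spark was built with on the cluster
scalaVersion := "2.11.8"
// "provided": the HDInsight cluster already supplies Spark at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0" % "provided"
// JDBC driver for Azure SQL Database (version is an assumption)
libraryDependencies += "com.microsoft.sqlserver" % "mssql-jdbc" % "6.2.2.jre8"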

Now I am trying to execute it from Azure Data Factory. I created this job:

{
    "name": "pipeline1",
    "properties": {
        "activities": [
            {
                "name": "Spark1",
                "type": "HDInsightSpark",
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false
                },
                "typeProperties": {
                    "rootPath": "dev/apps/spikes",
                    "entryFilePath": "test.jar",
                    "className": "SimpleApp",
                    "sparkJobLinkedService": {
                        "referenceName": "linkedServiceStorageBlobHDI",
                        "type": "LinkedServiceReference"
                    }
                },
                "linkedServiceName": {
                    "referenceName": "linkedServiceHDI",
                    "type": "LinkedServiceReference"
                }
            }
        ]
    }
}

But the execution fails with the following error:

18/05/28 12:52:53 ERROR ApplicationMaster: Uncaught exception: 
java.lang.ClassNotFoundException: SimpleApp
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.deploy.yarn.ApplicationMaster.startUserApplication(ApplicationMaster.scala:621)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:379)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:245)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:749)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:71)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1865)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:70)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:747)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/05/28 12:52:53 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.lang.ClassNotFoundException: SimpleApp)
18/05/28 12:52:53 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.lang.ClassNotFoundException: SimpleApp)
18/05/28 12:52:53 INFO ApplicationMaster: Deleting staging directory adl://home/user/livy/.sparkStaging/application_1527060048715_0507
18/05/28 12:52:53 INFO ShutdownHookManager: Shutdown hook called

I know the class cannot be found, but how do I fix this? Is the problem in my Scala code or in the Azure job?


EDIT: If I open test.jar, I see a lot of files/folders. I found SimpleApp.class in the /test folder (test is the name of my package). I tried "className": "test.SimpleApp" in ADF, but I still get the same error: java.lang.ClassNotFoundException: test.SimpleApp

You can try opening the jar and checking the path of the SimpleApp class.
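For example, here is a small Scala sketch that lists the class entries inside the jar without unzipping it by hand (InspectJar is a hypothetical helper, not part of the question's code, and test.jar is assumed to be in the current directory):

import java.util.jar.JarFile
import scala.collection.JavaConverters._

object InspectJar {
  def main(args: Array[String]): Unit = {
    // Print every .class entry; the entry path with '/' replaced by '.'
    // (minus the .class suffix) is the fully qualified class name that
    // the "className" property in ADF has to match.
    val jar = new JarFile("test.jar")
    jar.entries().asScala
      .filter(_.getName.endsWith(".class"))
      .foreach(entry => println(entry.getName))
    jar.close()
  }
}

An entry such as test/SimpleApp.class confirms that the fully qualified name is test.SimpleApp rather than SimpleApp.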
