
Spark-submit in Dataproc cluster generating java.lang.ClassNotFoundException

spark-submit runs fine on my local cluster without any issue. Because of resource limitations I moved to cloud-based computing, and I currently run a Spark cluster in Google Cloud Dataproc with 1 master and 4 workers. When I submit the job, I get the following error:

  1. My submit command:
spark-submit --master yarn --deploy-mode cluster --class com.aavash.ann.sparkann.GraphNetworkSCL cleanSCL2.jar Oldenburg_Nodes.txt Oldenburg_Edges.txt Oldenburg_part_4.txt
  2. More details from the YARN log file:
2022-09-28 05:06:00,129 INFO client.RMProxy: Connecting to ResourceManager at spark-cluster-m/10.146.0.5:8032
2022-09-28 05:06:00,769 INFO client.AHSProxy: Connecting to Application History server at spark-cluster-m/10.146.0.5:10200
2022-09-28 05:06:04,078 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

End of LogType:prelaunch.err
******************************************************************************

Container: container_1664339426994_0003_02_000001 on spark-cluster-w-0.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:stderr
LogLastModifiedTime:Wed Sep 28 04:52:32 +0000 2022
LogLength:1179
LogContents:
22/09/28 04:52:32 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: Uncaught exception: 
java.lang.ClassNotFoundException: com.aavash.ann.sparkann.GraphNetworkSCL
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at org.apache.spark.deploy.yarn.ApplicationMaster.startUserApplication(ApplicationMaster.scala:722)
        at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:496)
        at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:268)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:899)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:898)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:898)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

End of LogType:stderr
***********************************************************************

Container: container_1664339426994_0003_02_000001 on spark-cluster-w-0.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:directory.info
LogLastModifiedTime:Wed Sep 28 04:52:30 +0000 2022
LogLength:5110
LogContents:
ls -l:
total 32
lrwxrwxrwx 1 yarn yarn   77 Sep 28 04:52 __app__.jar -> /hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/15/cleanSCL2.jar
lrwxrwxrwx 1 yarn yarn   82 Sep 28 04:52 __spark_conf__ -> /hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/14/__spark_conf__.zip
-rw-r--r-- 1 yarn yarn   69 Sep 28 04:52 container_tokens
-rwx------ 1 yarn yarn  717 Sep 28 04:52 default_container_executor.sh
-rwx------ 1 yarn yarn  662 Sep 28 04:52 default_container_executor_session.sh
-rwx------ 1 yarn yarn 5140 Sep 28 04:52 launch_container.sh
drwx--x--- 2 yarn yarn 4096 Sep 28 04:52 tmp
find -L . -maxdepth 5 -ls:
  4128911      4 drwx--x---   3 yarn     yarn         4096 Sep 28 04:52 .
  4128919      4 -rwx------   1 yarn     yarn          717 Sep 28 04:52 ./default_container_executor.sh
  4128920      4 -rw-r--r--   1 yarn     yarn           16 Sep 28 04:52 ./.default_container_executor.sh.crc
  4128918      4 -rw-r--r--   1 yarn     yarn           16 Sep 28 04:52 ./.default_container_executor_session.sh.crc
  4128916      4 -rw-r--r--   1 yarn     yarn           52 Sep 28 04:52 ./.launch_container.sh.crc
  4128915      8 -rwx------   1 yarn     yarn         5140 Sep 28 04:52 ./launch_container.sh
  4128914      4 -rw-r--r--   1 yarn     yarn           12 Sep 28 04:52 ./.container_tokens.crc
  4128913      4 -rw-r--r--   1 yarn     yarn           69 Sep 28 04:52 ./container_tokens
  4128903    292 -r-x------   1 yarn     yarn       297978 Sep 28 04:52 ./__app__.jar
  4128873      4 drwx------   3 yarn     yarn         4096 Sep 28 04:52 ./__spark_conf__
  4128899    148 -r-x------   1 yarn     yarn       150701 Sep 28 04:52 ./__spark_conf__/__spark_hadoop_conf__.xml
  4128901      4 -r-x------   1 yarn     yarn          470 Sep 28 04:52 ./__spark_conf__/__spark_dist_cache__.properties
  4128876      4 -r-x------   1 yarn     yarn          704 Sep 28 04:52 ./__spark_conf__/metrics.properties
  4128877      4 drwx------   2 yarn     yarn         4096 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__
  4128892      4 -r-x------   1 yarn     yarn         2163 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/mapred-env.sh
  4128896      4 -r-x------   1 yarn     yarn          977 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/fairscheduler.xml
  4128894      4 -r-x------   1 yarn     yarn         1535 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/distcp-default.xml
  4128890      8 -r-x------   1 yarn     yarn         7522 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/yarn-env.sh
  4128878     20 -r-x------   1 yarn     yarn        17233 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hadoop-env.sh
  4128895     12 -r-x------   1 yarn     yarn        11392 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hadoop-policy.xml
  4128888      4 -r-x------   1 yarn     yarn         1335 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/configuration.xsl
  4128887      0 -r-x------   1 yarn     yarn            0 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/nodes_exclude
  4128893      4 -r-x------   1 yarn     yarn         2316 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/ssl-client.xml.example
  4128891      4 -r-x------   1 yarn     yarn         1940 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/container-executor.cfg
  4128879     12 -r-x------   1 yarn     yarn         8338 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/mapred-site.xml
  4128881      4 -r-x------   1 yarn     yarn         3321 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hadoop-metrics2.properties
  4128880     16 -r-x------   1 yarn     yarn        14772 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/log4j.properties
  4128884      8 -r-x------   1 yarn     yarn         4131 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/core-site.xml
  4128882      4 -r-x------   1 yarn     yarn           82 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/yarn-timelineserver.logging.properties
  4128897      4 -r-x------   1 yarn     yarn         2697 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/ssl-server.xml.example
  4128886      0 -r-x------   1 yarn     yarn            0 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/nodes_include
  4128889      8 -r-x------   1 yarn     yarn         7052 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hdfs-site.xml
  4128898      8 -r-x------   1 yarn     yarn         4113 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/mapred-queues.xml.template
  4128883     12 -r-x------   1 yarn     yarn         8291 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/yarn-site.xml
  4128885     12 -r-x------   1 yarn     yarn         9533 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/capacity-scheduler.xml
  4128875      4 -r-x------   1 yarn     yarn         1225 Sep 28 04:52 ./__spark_conf__/log4j.properties
  4128900      4 -r-x------   1 yarn     yarn         1530 Sep 28 04:52 ./__spark_conf__/__spark_conf__.properties
  4128917      4 -rwx------   1 yarn     yarn          662 Sep 28 04:52 ./default_container_executor_session.sh
  4128912      4 drwx--x---   2 yarn     yarn         4096 Sep 28 04:52 ./tmp
broken symlinks(find -L . -maxdepth 5 -type l -ls):

End of LogType:directory.info
*******************************************************************************


End of LogType:stdout
***********************************************************************

Container: container_1664339426994_0003_02_000001 on spark-cluster-w-0.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:launch_container.sh
LogLastModifiedTime:Wed Sep 28 04:52:30 +0000 2022
LogLength:5140
LogContents:
#!/bin/bash

set -o pipefail -e
export PRELAUNCH_OUT="/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/prelaunch.out"
exec >"${PRELAUNCH_OUT}"
export PRELAUNCH_ERR="/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/prelaunch.err"
exec 2>"${PRELAUNCH_ERR}"
echo "Setting up env variables"
export PATH=${PATH:-"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin"}
export JAVA_HOME=${JAVA_HOME:-"/usr/lib/jvm/temurin-8-jdk-amd64"}
export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/usr/lib/hadoop"}
export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/usr/lib/hadoop-hdfs"}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop/conf"}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/usr/lib/hadoop-yarn"}
export HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-"/usr/lib/hadoop-mapreduce"}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-":/usr/lib/hadoop/lib/native"}
export HADOOP_TOKEN_FILE_LOCATION="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/appcache/application_1664339426994_0003/container_1664339426994_0003_02_000001/container_tokens"
export CONTAINER_ID="container_1664339426994_0003_02_000001"
export NM_PORT="8026"
export NM_HOST="spark-cluster-w-0.c.apache-spark-project-363713.internal"
export NM_HTTP_PORT="8042"
export LOCAL_DIRS="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/appcache/application_1664339426994_0003"
export LOCAL_USER_DIRS="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/"
export LOG_DIRS="/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001"
export USER="aavashbhandari"
export LOGNAME="aavashbhandari"
export HOME="/home/"
export PWD="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/appcache/application_1664339426994_0003/container_1664339426994_0003_02_000001"
export LOCALIZATION_COUNTERS="563687,0,2,0,125"
export JVM_PID="$$"
export NM_AUX_SERVICE_spark_shuffle=""
export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
export SPARK_YARN_STAGING_DIR="hdfs://spark-cluster-m/user/aavashbhandari/.sparkStaging/application_1664339426994_0003"
export APP_SUBMIT_TIME_ENV="1664340741193"
export PYSPARK_PYTHON="/opt/conda/default/bin/python"
export PYTHONHASHSEED="0"
export APPLICATION_WEB_PROXY_BASE="/proxy/application_1664339426994_0003"
export SPARK_DIST_CLASSPATH=":/etc/hive/conf:/usr/local/share/google/dataproc/lib/*:/usr/share/java/mysql.jar"
export CLASSPATH="$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:/usr/lib/spark/jars/*::/etc/hive/conf:/usr/local/share/google/dataproc/lib/*:/usr/share/java/mysql.jar:$PWD/__spark_conf__/__hadoop_conf__"
export SPARK_USER="aavashbhandari"
export MALLOC_ARENA_MAX="4"
echo "Setting up job resources"
ln -sf -- "/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/14/__spark_conf__.zip" "__spark_conf__"
ln -sf -- "/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/15/cleanSCL2.jar" "__app__.jar"
echo "Copying debugging information"
# Creating copy of launch script
cp "launch_container.sh" "/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/launch_container.sh"
chmod 640 "/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/launch_container.sh"
# Determining directory contents
echo "ls -l:" 1>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/directory.info"
ls -l 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/directory.info"
echo "find -L . -maxdepth 5 -ls:" 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/directory.info"
find -L . -maxdepth 5 -ls 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/directory.info"
echo "broken symlinks(find -L . -maxdepth 5 -type l -ls):" 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/directory.info"
find -L . -maxdepth 5 -type l -ls 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/directory.info"
echo "Launching container"
exec /bin/bash -c "$JAVA_HOME/bin/java -server -Xmx2048m -Djava.io.tmpdir=$PWD/tmp -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.aavash.ann.sparkann.GraphNetworkSCL' --jar file:/home/aavashbhandari/cleanSCL2.jar --arg 'Oldenburg_Nodes.txt' --arg 'Oldenburg_Edges.txt' --arg 'Oldenburg_part_4.txt' --properties-file $PWD/__spark_conf__/__spark_conf__.properties --dist-cache-conf $PWD/__spark_conf__/__spark_dist_cache__.properties 1> /var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/stdout 2> /var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_02_000001/stderr"

End of LogType:launch_container.sh
************************************************************************************

Container: container_1664339426994_0003_02_000001 on spark-cluster-w-0.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:prelaunch.out
LogLastModifiedTime:Wed Sep 28 04:52:30 +0000 2022
LogLength:100
LogContents:
Setting up env variables
Setting up job resources
Copying debugging information
Launching container

End of LogType:prelaunch.out
******************************************************************************


End of LogType:stdout
***********************************************************************

Container: container_1664339426994_0003_01_000001 on spark-cluster-w-1.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:prelaunch.out
LogLastModifiedTime:Wed Sep 28 04:52:25 +0000 2022
LogLength:100
LogContents:
Setting up env variables
Setting up job resources
Copying debugging information
Launching container

End of LogType:prelaunch.out
******************************************************************************

Container: container_1664339426994_0003_01_000001 on spark-cluster-w-1.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:stderr
LogLastModifiedTime:Wed Sep 28 04:52:29 +0000 2022
LogLength:1179
LogContents:
22/09/28 04:52:29 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: Uncaught exception: 
java.lang.ClassNotFoundException: com.aavash.ann.sparkann.GraphNetworkSCL
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at org.apache.spark.deploy.yarn.ApplicationMaster.startUserApplication(ApplicationMaster.scala:722)
        at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:496)
        at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:268)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:899)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:898)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:898)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

End of LogType:stderr
***********************************************************************


End of LogType:prelaunch.err
******************************************************************************

Container: container_1664339426994_0003_01_000001 on spark-cluster-w-1.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:directory.info
LogLastModifiedTime:Wed Sep 28 04:52:25 +0000 2022
LogLength:5110
LogContents:
ls -l:
total 32
lrwxrwxrwx 1 yarn yarn   77 Sep 28 04:52 __app__.jar -> /hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/11/cleanSCL2.jar
lrwxrwxrwx 1 yarn yarn   82 Sep 28 04:52 __spark_conf__ -> /hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/10/__spark_conf__.zip
-rw-r--r-- 1 yarn yarn   69 Sep 28 04:52 container_tokens
-rwx------ 1 yarn yarn  717 Sep 28 04:52 default_container_executor.sh
-rwx------ 1 yarn yarn  662 Sep 28 04:52 default_container_executor_session.sh
-rwx------ 1 yarn yarn 5141 Sep 28 04:52 launch_container.sh
drwx--x--- 2 yarn yarn 4096 Sep 28 04:52 tmp
find -L . -maxdepth 5 -ls:
  4128841      4 drwx--x---   3 yarn     yarn         4096 Sep 28 04:52 .
  4128849      4 -rwx------   1 yarn     yarn          717 Sep 28 04:52 ./default_container_executor.sh
  4128850      4 -rw-r--r--   1 yarn     yarn           16 Sep 28 04:52 ./.default_container_executor.sh.crc
  4128848      4 -rw-r--r--   1 yarn     yarn           16 Sep 28 04:52 ./.default_container_executor_session.sh.crc
  4128846      4 -rw-r--r--   1 yarn     yarn           52 Sep 28 04:52 ./.launch_container.sh.crc
  4128845      8 -rwx------   1 yarn     yarn         5141 Sep 28 04:52 ./launch_container.sh
  4128844      4 -rw-r--r--   1 yarn     yarn           12 Sep 28 04:52 ./.container_tokens.crc
  4128843      4 -rw-r--r--   1 yarn     yarn           69 Sep 28 04:52 ./container_tokens
  4128833    292 -r-x------   1 yarn     yarn       297978 Sep 28 04:52 ./__app__.jar
  4128803      4 drwx------   3 yarn     yarn         4096 Sep 28 04:52 ./__spark_conf__
  4128829    148 -r-x------   1 yarn     yarn       150701 Sep 28 04:52 ./__spark_conf__/__spark_hadoop_conf__.xml
  4128831      4 -r-x------   1 yarn     yarn          470 Sep 28 04:52 ./__spark_conf__/__spark_dist_cache__.properties
  4128806      4 -r-x------   1 yarn     yarn          704 Sep 28 04:52 ./__spark_conf__/metrics.properties
  4128807      4 drwx------   2 yarn     yarn         4096 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__
  4128822      4 -r-x------   1 yarn     yarn         2163 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/mapred-env.sh
  4128826      4 -r-x------   1 yarn     yarn          977 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/fairscheduler.xml
  4128824      4 -r-x------   1 yarn     yarn         1535 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/distcp-default.xml
  4128820      8 -r-x------   1 yarn     yarn         7522 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/yarn-env.sh
  4128808     20 -r-x------   1 yarn     yarn        17233 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hadoop-env.sh
  4128825     12 -r-x------   1 yarn     yarn        11392 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hadoop-policy.xml
  4128818      4 -r-x------   1 yarn     yarn         1335 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/configuration.xsl
  4128817      0 -r-x------   1 yarn     yarn            0 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/nodes_exclude
  4128823      4 -r-x------   1 yarn     yarn         2316 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/ssl-client.xml.example
  4128821      4 -r-x------   1 yarn     yarn         1940 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/container-executor.cfg
  4128809     12 -r-x------   1 yarn     yarn         8338 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/mapred-site.xml
  4128811      4 -r-x------   1 yarn     yarn         3321 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hadoop-metrics2.properties
  4128810     16 -r-x------   1 yarn     yarn        14772 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/log4j.properties
  4128814      8 -r-x------   1 yarn     yarn         4131 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/core-site.xml
  4128812      4 -r-x------   1 yarn     yarn           82 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/yarn-timelineserver.logging.properties
  4128827      4 -r-x------   1 yarn     yarn         2697 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/ssl-server.xml.example
  4128816      0 -r-x------   1 yarn     yarn            0 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/nodes_include
  4128819      8 -r-x------   1 yarn     yarn         7052 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/hdfs-site.xml
  4128828      8 -r-x------   1 yarn     yarn         4113 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/mapred-queues.xml.template
  4128813     12 -r-x------   1 yarn     yarn         8291 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/yarn-site.xml
  4128815     12 -r-x------   1 yarn     yarn         9533 Sep 28 04:52 ./__spark_conf__/__hadoop_conf__/capacity-scheduler.xml
  4128805      4 -r-x------   1 yarn     yarn         1225 Sep 28 04:52 ./__spark_conf__/log4j.properties
  4128830      4 -r-x------   1 yarn     yarn         1530 Sep 28 04:52 ./__spark_conf__/__spark_conf__.properties
  4128847      4 -rwx------   1 yarn     yarn          662 Sep 28 04:52 ./default_container_executor_session.sh
  4128842      4 drwx--x---   2 yarn     yarn         4096 Sep 28 04:52 ./tmp
broken symlinks(find -L . -maxdepth 5 -type l -ls):

End of LogType:directory.info
*******************************************************************************

Container: container_1664339426994_0003_01_000001 on spark-cluster-w-1.c.apache-spark-project-363713.internal_8026
LogAggregationType: AGGREGATED
==================================================================================================================
LogType:launch_container.sh
LogLastModifiedTime:Wed Sep 28 04:52:25 +0000 2022
LogLength:5141
LogContents:
#!/bin/bash

set -o pipefail -e
export PRELAUNCH_OUT="/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/prelaunch.out"
exec >"${PRELAUNCH_OUT}"
export PRELAUNCH_ERR="/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/prelaunch.err"
exec 2>"${PRELAUNCH_ERR}"
echo "Setting up env variables"
export PATH=${PATH:-"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin"}
export JAVA_HOME=${JAVA_HOME:-"/usr/lib/jvm/temurin-8-jdk-amd64"}
export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/usr/lib/hadoop"}
export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/usr/lib/hadoop-hdfs"}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop/conf"}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/usr/lib/hadoop-yarn"}
export HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-"/usr/lib/hadoop-mapreduce"}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-":/usr/lib/hadoop/lib/native"}
export HADOOP_TOKEN_FILE_LOCATION="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/appcache/application_1664339426994_0003/container_1664339426994_0003_01_000001/container_tokens"
export CONTAINER_ID="container_1664339426994_0003_01_000001"
export NM_PORT="8026"
export NM_HOST="spark-cluster-w-1.c.apache-spark-project-363713.internal"
export NM_HTTP_PORT="8042"
export LOCAL_DIRS="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/appcache/application_1664339426994_0003"
export LOCAL_USER_DIRS="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/"
export LOG_DIRS="/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001"
export USER="aavashbhandari"
export LOGNAME="aavashbhandari"
export HOME="/home/"
export PWD="/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/appcache/application_1664339426994_0003/container_1664339426994_0003_01_000001"
export LOCALIZATION_COUNTERS="563687,0,2,0,1366"
export JVM_PID="$$"
export NM_AUX_SERVICE_spark_shuffle=""
export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
export SPARK_YARN_STAGING_DIR="hdfs://spark-cluster-m/user/aavashbhandari/.sparkStaging/application_1664339426994_0003"
export APP_SUBMIT_TIME_ENV="1664340741193"
export PYSPARK_PYTHON="/opt/conda/default/bin/python"
export PYTHONHASHSEED="0"
export APPLICATION_WEB_PROXY_BASE="/proxy/application_1664339426994_0003"
export SPARK_DIST_CLASSPATH=":/etc/hive/conf:/usr/local/share/google/dataproc/lib/*:/usr/share/java/mysql.jar"
export CLASSPATH="$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:/usr/lib/spark/jars/*::/etc/hive/conf:/usr/local/share/google/dataproc/lib/*:/usr/share/java/mysql.jar:$PWD/__spark_conf__/__hadoop_conf__"
export SPARK_USER="aavashbhandari"
export MALLOC_ARENA_MAX="4"
echo "Setting up job resources"
ln -sf -- "/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/11/cleanSCL2.jar" "__app__.jar"
ln -sf -- "/hadoop/yarn/nm-local-dir/usercache/aavashbhandari/filecache/10/__spark_conf__.zip" "__spark_conf__"
echo "Copying debugging information"
# Creating copy of launch script
cp "launch_container.sh" "/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/launch_container.sh"
chmod 640 "/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/launch_container.sh"
# Determining directory contents
echo "ls -l:" 1>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/directory.info"
ls -l 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/directory.info"
echo "find -L . -maxdepth 5 -ls:" 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/directory.info"
find -L . -maxdepth 5 -ls 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/directory.info"
echo "broken symlinks(find -L . -maxdepth 5 -type l -ls):" 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/directory.info"
find -L . -maxdepth 5 -type l -ls 1>>"/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/directory.info"
echo "Launching container"
exec /bin/bash -c "$JAVA_HOME/bin/java -server -Xmx2048m -Djava.io.tmpdir=$PWD/tmp -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.aavash.ann.sparkann.GraphNetworkSCL' --jar file:/home/aavashbhandari/cleanSCL2.jar --arg 'Oldenburg_Nodes.txt' --arg 'Oldenburg_Edges.txt' --arg 'Oldenburg_part_4.txt' --properties-file $PWD/__spark_conf__/__spark_conf__.properties --dist-cache-conf $PWD/__spark_conf__/__spark_dist_cache__.properties 1> /var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/stdout 2> /var/log/hadoop-yarn/userlogs/application_1664339426994_0003/container_1664339426994_0003_01_000001/stderr"

End of LogType:launch_container.sh
************************************************************************************
  3. My POM file:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.aavash.ann</groupId>
    <artifactId>sparkann</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <name>SparkANN</name>
    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.1.2</version>
        </dependency>

        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.12.14</version>
        </dependency>

    </dependencies>
    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Replace the build properties in your pom.xml with the following code. Your current setup does not produce a fat JAR, so your main class never makes it onto the cluster's classpath.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.aavash.ann</groupId>
<artifactId>sparkann</artifactId>
<version>0.0.1-SNAPSHOT</version>

<name>SparkANN</name>
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>3.1.2</version>
    </dependency>

    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.12.14</version>
    </dependency>

</dependencies>
<build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <plugins>
        <plugin>
            <groupId>org.scala-tools</groupId>
            <artifactId>maven-scala-plugin</artifactId>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                </execution>
            </executions>
            <configuration>
                <scalaVersion>2.12.14</scalaVersion>
                <args>
                    <arg>-target:jvm-1.8</arg>
                </args>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-eclipse-plugin</artifactId>
            <configuration>
                <downloadSources>true</downloadSources>
                <buildcommands>
                    <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
                </buildcommands>
                <additionalProjectnatures>
                    <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
                </additionalProjectnatures>
                <classpathContainers>
                    <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
                    <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
                </classpathContainers>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.4.1</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
</project>

Build it with the following command:

mvn clean package
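With the maven-assembly-plugin configured above, the fat JAR should land in `target/` with a `jar-with-dependencies` suffix. The file name below is inferred from the POM's artifactId and version, so adjust it if your build names it differently. Before resubmitting, it is worth confirming the main class actually made it into the JAR:

```shell
# Check that the main class is packaged (JAR name inferred from the POM)
jar tf target/sparkann-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
  | grep 'com/aavash/ann/sparkann/GraphNetworkSCL.class'

# Resubmit using the assembled fat JAR instead of cleanSCL2.jar
spark-submit --master yarn --deploy-mode cluster \
  --class com.aavash.ann.sparkann.GraphNetworkSCL \
  target/sparkann-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
  Oldenburg_Nodes.txt Oldenburg_Edges.txt Oldenburg_part_4.txt
```

If the grep prints nothing, the class is still missing from the JAR and the ClassNotFoundException will recur regardless of how the job is submitted.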

