Spark in AKS. Error: Could not find or load main class org.apache.spark.launcher.Main
Update 1: After adding the missing pieces and environment variables from Spark installation - Error: Could not find or load main class org.apache.spark.launcher.Main, the command no longer throws an error, but it prints itself and does nothing else. This is the new result of running the command:
"C:\Program Files\Java\jdk1.8.0_271\bin\java" -cp "C:\Users\xxx\repos\spark/conf\;C:\Users\xxx\repos\spark\assembly\target\scala-2.12\jars\*" org.apache.spark.deploy.SparkSubmit --master k8s://http://127.0.0.1:8001 --deploy-mode cluster --conf "spark.kubernetes.container.image=xxx.azurecr.io/spark:spark2.4.5_scala2.12.12" --conf "spark.kubernetes.authenticate.driver.serviceAccountName=spark" --conf "spark.executor.instances=3" --class com.xxx.bigdata.xxx.XMain --name xxx_app https://storage.blob.core.windows.net/jars/xxx.jar
I have been following this guide for setting up Spark in AKS: https://docs.microsoft.com/en-us/azure/aks/spark-job . I am using Spark tag 2.4.5 with Scala 2.12.12. I have done all the steps in the guide and am submitting the job with:
./bin/spark-submit \
--master k8s://http://127.0.0.1:8001 \
--deploy-mode cluster \
--name xxx_app \
--class com.xxx.bigdata.xxx.XMain \
--conf spark.executor.instances=3 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=xxx.azurecr.io/spark:spark2.4.5_scala2.12.12 \
"https://storage.blob.core.windows.net/jars/xxx.jar"
All I am getting is: Error: Could not find or load main class org.apache.spark.launcher.Main
Now, the funny thing is that it doesn't matter at all what I change in the command. I can mess up the ACR address, the Spark image name, the jar location, the API server address, anything, and I still get the same error.
I guess I must be making some silly mistake, as it seems nothing can break the command more than it already is, but I can't really nail it down. Does anyone have ideas about what might be wrong?
Looks like it might be a problem on the machine where you are executing spark-submit. You might be missing some jars on the classpath of that machine. Worth checking out Spark installation - Error: Could not find or load main class org.apache.spark.launcher.Main
Alright, so I managed to submit jobs with spark-submit.cmd instead. It works without any additional setup.
I didn't manage to get the bash script to work in the end, and I do not have time to investigate it further at the moment. So, sorry for providing a half-baked answer that only partially resolves the original problem, but it is a solution nonetheless.
The command below works fine:
bin\spark-submit.cmd --master k8s://http://127.0.0.1:8001 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=3 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark --conf spark.kubernetes.namespace=dev --conf spark.kubernetes.container.image=xxx.azurecr.io/spark:spark-2.4.5_scala-2.12_hadoop-2.7.7 https://xxx.blob.core.windows.net/jars/SparkPi-assembly-0.1.0-SNAPSHOT.jar
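For completeness, the `spark.kubernetes.namespace=dev` and `serviceAccountName=spark` settings above assume that namespace and service account already exist. Following the pattern in the AKS guide (which binds the `edit` cluster role so Spark can create driver and executor pods), they could be created with something like this, with the names taken from the command above:

```shell
# Assumed prerequisite setup for the conf flags above: a dev namespace and
# a spark service account allowed to create driver/executor pods in it.
kubectl create namespace dev
kubectl create serviceaccount spark --namespace dev
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=dev:spark \
  --namespace=dev
```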