繁体   English   中英

Spark-submit 使用 yarn master 失败,错误要求在 scala.Predef 失败

[英]Spark-submit fails with yarn master, error- requirement failed at scala.Predef

我的 Spark 作业因以下异常而失败,我无法弄清楚缺少什么导致作业失败的要求:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
        at scala.Predef$.require(Predef.scala:221)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:472)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:470)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:470)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:468)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:468)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:742)

Spark提交命令:

spark-submit --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/xyz/conf/log4j.xml \
-DHOME=/xyz/transformation -DENV=e1 \
-DJOB=xformation --conf spark.local.dir=/warehouse/tmp/spark1489619325 \
--queue dev --master yarn --deploy-mode cluster \
--properties-file /xyz/conf/job.conf \
--files /xyz/conf/e1.properties --class TransformationJob /xyz/job.jar

相同的程序在 master 和 local 上都可以正常工作。

任何建议都会有很大帮助。 提前致谢。

我在类路径中有大量使用“--jars”元素的 jar 列表,其中一个 jar 是罪魁祸首,当我从“--jars”中删除它时,问题得到了解决,我仍然不确定为什么 spark-submit 失败因为那个罐子。

我有由数据文件引起的错误。 我指出了错误的火车数据方向。

训练数据和测试数据不匹配。

训练数据: 0,1 0 0 1 0 0 1 0 0 1

测试数据: 1, 2, 0, 0, 10

我修复了火车数据源方向,问题已经解决。

我收到了类似的错误:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
        at scala.Predef$.require(Predef.scala:221)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:501)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:499)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8.apply(Client.scala:499)
        at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8.apply(Client.scala:497)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:497)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:763)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:143)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1109)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

解决方法是:

在终端或日志上,您将有一个 WARN 行,如下所示:

警告客户端:资源文件:多次添加到分布式缓存中。

只需从 spark submit script 中删除这个额外的 jar。 希望这会帮助所有人。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM