简体   繁体   中英

sbt-assembly not including dependencies

I am trying to build a fat jar to send to spark-submit using sbt assembly. However, I cannot seem to get the build process right.

My current build.sbt is as follows

name := "MyAppName"

version := "1.0"

scalaVersion := "2.10.6"


libraryDependencies  ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.6.0" % "provided",
  "org.scalanlp" %% "breeze" % "0.12",
  "org.scalanlp" %% "breeze-natives" % "0.12"
)

resolvers ++= Seq(
  "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/"
)

Running sbt-sssembly produces a jar. However, after submitting the jar to spark-submit spark-submit MyAppName-assembly-1.0.jar (there's already a main class specified so I'm assuming its ok I don't specify a class), the following exception gets thrown:

java.lang.NoSuchMethodError: breeze.linalg.DenseVector.noOffsetOrStride()Z
at breeze.linalg.DenseVector$canDotD$.apply(DenseVector.scala:629)
at breeze.linalg.DenseVector$canDotD$.apply(DenseVector.scala:626)
at breeze.linalg.ImmutableNumericOps$class.dot(NumericOps.scala:98)
at breeze.linalg.DenseVector.dot(DenseVector.scala:50)
at RunMe$.cosSimilarity(RunMe.scala:103)
at RunMe$$anonfun$4.apply(RunMe.scala:35)
at RunMe$$anonfun$4.apply(RunMe.scala:33)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
at org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:658)
at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1377)
at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1374)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

I'm relatively new to the world of scala and sbt, so any help would be greatly appreciated!

So it turns out the issue is that breeze is already included in spark. The issue was that spark contained a newer Breeze version with methods that my version didn't have.

My reference: Apache Spark - java.lang.NoSuchMethodError: breeze.linalg.DenseVector

I had a similar problem. I ended up saving the jar under lib directory then in assembly.sbt add :

unmanagedJars in Compile += file("lib/my.jar")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM