
sbt: using local jar without breaking the dependencies

I am building an application that uses Spark and Spark-mllib; the build.sbt declares the dependencies as follows:

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.0" withSources() withJavadoc(),
      "org.apache.spark" %% "spark-mllib" % "1.6.0" withSources() withJavadoc()
    )

This works fine. Now I would like to change some code in mllib and rebuild the application with sbt. Here is what I did:

  1. Download the source code of Spark 1.6.0, modify the code in mllib, and recompile it into a jar named spark-mllib_2.10-1.6.0.jar.
  2. Put the aforementioned jar into the lib directory of the project.
  3. Also put the spark-core_2.10-1.6.0.jar into the lib directory of the project.
  4. Delete the libraryDependencies statement in the build.sbt file.
  5. run sbt clean package

However, this doesn't compile, because of the missing transitive dependencies that spark-core and spark-mllib need in order to run. Those dependencies are resolved by sbt automatically only when they are declared in libraryDependencies in build.sbt.

So I put the libraryDependencies statement back in build.sbt, hoping that sbt would resolve the dependency issues and still use my local spark-mllib instead of the one from the remote repository. However, running my application showed that this was not the case.

So I am wondering if there is a way to use my local spark-mllib jar without resolving its dependencies manually?

UPDATE: I followed the first approach of Roberto Congiu's answer, and successfully built the package using the following build.sbt:

    lazy val commonSettings = Seq(
      scalaVersion := "2.10.5",
      libraryDependencies ++= Seq(
        "org.apache.spark" %% "spark-core" % "1.6.0" withSources() withJavadoc(),
        "org.apache.spark" %% "spark-streaming" % "1.6.0" withSources() withJavadoc(),
        "org.apache.spark" %% "spark-sql" % "1.6.0" withSources() withJavadoc(),
        "org.scalanlp" %% "breeze" % "0.11.2"
      )
    )

    lazy val core = project.
      settings(commonSettings: _*).
      settings(
        name := "xSpark",
        version := "0.01"
      )

    lazy val example = project.
      settings(commonSettings: _*).
      settings(
        name := "xSparkExample",
        version := "0.01"
      ).
      dependsOn(core)

xSparkExample includes a KMeans example which calls xSpark, and xSpark calls the KMeans function in spark-mllib. This spark-mllib is a customized jar which I put in the core/lib directory so that sbt picks it up as an unmanaged dependency.

However, running my application still doesn't use the customized jar for some reason. I even ran find . -name "spark-mllib_2.10-1.6.0.jar" to make sure there is no other copy of the jar on my system.
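One way to debug this kind of problem (my addition, not from the original post) is to ask the JVM which jar a class was actually loaded from; checking org.apache.spark.mllib.clustering.KMeans at runtime reveals whether the classpath winner is the jar in core/lib or the stock one from the Ivy cache. A minimal sketch:

```scala
// Sketch: report the jar a given class was loaded from, to verify that
// the customized spark-mllib jar (and not a cached copy) is in use.
object WhichJar {
  def locate(className: String): String =
    Class.forName(className)
      .getProtectionDomain
      .getCodeSource      // null for JDK bootstrap classes, non-null for jars
      .getLocation
      .toString
}
```

Calling WhichJar.locate("org.apache.spark.mllib.clustering.KMeans") from the driver should print a path under core/lib if the local jar won; a path under ~/.ivy2/cache means sbt is still resolving the published artifact.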

One way to do it is to have your custom mllib as an unmanaged dependency. Unmanaged dependencies are jars placed in a directory; SBT picks them up as they are, so you are responsible for providing their dependencies as well. You can read about unmanaged dependencies here: http://www.scala-sbt.org/0.13/docs/Library-Dependencies.html

So, you can try the following:

  1. Create a lib directory and put your custom mllib jar there. That's the default location for unmanaged libs, and sbt will pick it up automatically.
  2. In your build.sbt, remove the reference to mllib and add all of its dependencies, which are listed in its POM here: https://github.com/apache/spark/blob/master/mllib/pom.xml . You can skip the ones with test scope.
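The steps above might come out looking like the build.sbt fragment below. The dependency list is an illustrative subset read from mllib's POM for 1.6.0, not an exhaustive one; check the POM for your exact version:

```scala
// Sketch: build.sbt for a project using an unmanaged spark-mllib jar in lib/.
// The jar itself is picked up automatically from lib/; its compile-scope
// dependencies (normally pulled in transitively) must now be declared by hand.
scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.6.0",  // mllib builds on core
  "org.apache.spark" %% "spark-streaming" % "1.6.0",
  "org.apache.spark" %% "spark-sql"       % "1.6.0",
  "org.apache.spark" %% "spark-graphx"    % "1.6.0",  // used by e.g. LDA
  "org.scalanlp"     %% "breeze"          % "0.11.2"  // mllib's linear algebra
)
```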

Another way to do it is to have your own Maven repository (e.g. Artifactory) where you publish your custom artifacts, and have sbt pull from that repository first. This has the advantage that other people will be able to build the code and use your custom mllib library.
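A lighter-weight variant of this repository idea (my addition, not part of the answer) is to install the rebuilt artifact into the local Maven repository, e.g. with `build/mvn -pl mllib -DskipTests install` from the Spark source tree, and then tell sbt to resolve from there first:

```scala
// Sketch: build.sbt resolving the custom spark-mllib from ~/.m2/repository.
// Resolver.mavenLocal is a standard sbt resolver for the local Maven repo.
resolvers += Resolver.mavenLocal

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.6.0",
  "org.apache.spark" %% "spark-mllib" % "1.6.0"  // found in ~/.m2 first
)
```

One caveat: because the version string is identical to the published one, sbt's Ivy cache (~/.ivy2/cache) may still hold the stock jar; deleting that cache entry, or giving your build a distinct version such as 1.6.0-custom, avoids the ambiguity.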


 