scala sbt libraryDependencies provided - Avoid downloading 3rd party library

I have the following Spark Scala code that references third-party libraries:

package com.protegrity.spark

import org.apache.spark.sql.api.java.UDF2
import com.protegrity.spark.udf.ptyProtectStr
import com.protegrity.spark.udf.ptyUnprotectStr
import com.protegrity.spark.udf.ptyProtectInt
import com.protegrity.spark.udf.ptyUnprotectInt

class ptyProtectStr extends UDF2[String, String, String] {
  
  def call(input: String, dataElement: String): String = {
    return ptyProtectStr(input, dataElement);
  }
}

class ptyUnprotectStr extends UDF2[String, String, String] {
  
  def call(input: String, dataElement: String): String = {
    return ptyUnprotectStr(input, dataElement);
  }
}

class ptyProtectInt extends UDF2[Integer, String, Integer] {
  
  def call(input: Integer, dataElement: String): Integer = {
    return ptyProtectInt(input, dataElement);
  }
}

class ptyUnprotectInt extends UDF2[Integer, String, Integer] {

  def call(input: Integer, dataElement: String): Integer = {
    return ptyUnprotectInt(input, dataElement);
  }
}

I want to create a JAR file using SBT. My build.sbt looks like the following:

name := "Protegrity UDF"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
    "com.protegrity.spark" % "udf" % "2.3.2" % "provided",
    "org.apache.spark" %% "spark-core" % "2.3.2" % "provided",
    "org.apache.spark" %% "spark-sql" % "2.3.2" % "provided"
)

As you can see, I'm trying to create a thin JAR file using the "provided" option, since my Spark environment already contains those libraries.

In spite of using "provided", sbt is still trying to download the artifact from Maven and throws the error below:

[warn]  Note: Unresolved dependencies path:
[error] sbt.librarymanagement.ResolveException: Error downloading com.protegrity.spark:udf:2.3.2
[error]   Not found
[error]   Not found
[error]   not found: C:\Users\user1\.ivy2\local\com.protegrity.spark\udf\2.3.2\ivys\ivy.xml
[error]   not found: https://repo1.maven.org/maven2/com/protegrity/spark/udf/2.3.2/udf-2.3.2.pom
[error]         at lmcoursier.CoursierDependencyResolution.unresolvedWarningOrThrow(CoursierDependencyResolution.scala:249)
[error]         at lmcoursier.CoursierDependencyResolution.$anonfun$update$35(CoursierDependencyResolution.scala:218)
[error]         at scala.util.Either$LeftProjection.map(Either.scala:573)
[error]         at lmcoursier.CoursierDependencyResolution.update(CoursierDependencyResolution.scala:218)
[error]         at sbt.librarymanagement.DependencyResolution.update(DependencyResolution.scala:60)
[error]         at sbt.internal.LibraryManagement$.resolve$1(LibraryManagement.scala:52)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$12(LibraryManagement.scala:102)
[error]         at sbt.util.Tracked$.$anonfun$lastOutput$1(Tracked.scala:69)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$20(LibraryManagement.scala:115)
[error]         at scala.util.control.Exception$Catch.apply(Exception.scala:228)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11(LibraryManagement.scala:115)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11$adapted(LibraryManagement.scala:96)
[error]         at sbt.util.Tracked$.$anonfun$inputChanged$1(Tracked.scala:150)
[error]         at sbt.internal.LibraryManagement$.cachedUpdate(LibraryManagement.scala:129)
[error]         at sbt.Classpaths$.$anonfun$updateTask0$5(Defaults.scala:2950)
[error]         at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error]         at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error]         at sbt.std.Transform$$anon$4.work(Transform.scala:67)
[error]         at sbt.Execute.$anonfun$submit$2(Execute.scala:281)
[error]         at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:19)
[error]         at sbt.Execute.work(Execute.scala:290)
[error]         at sbt.Execute.$anonfun$submit$1(Execute.scala:281)
[error]         at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:178)
[error]         at sbt.CompletionService$$anon$2.call(CompletionService.scala:37)
[error]         at java.util.concurrent.FutureTask.run(Unknown Source)
[error]         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
[error]         at java.util.concurrent.FutureTask.run(Unknown Source)
[error]         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
[error]         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
[error]         at java.lang.Thread.run(Unknown Source)
[error] (update) sbt.librarymanagement.ResolveException: Error downloading com.protegrity.spark:udf:2.3.2
[error]   Not found
[error]   Not found
[error]   not found: C:\Users\user1\.ivy2\local\com.protegrity.spark\udf\2.3.2\ivys\ivy.xml
[error]   not found: https://repo1.maven.org/maven2/com/protegrity/spark/udf/2.3.2/udf-2.3.2.pom

What change should I make in build.sbt to skip the Maven download for "com.protegrity.spark"? Interestingly, I don't face this issue for "org.apache.spark" in the same build.

Assuming that you have the JAR file available (but not through Maven or another artifact repository) wherever you're compiling the code, just place the JAR in (by default) the lib directory within your project (the path can be changed with the unmanagedBase setting in build.sbt if you need to do that for some reason).
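
As a minimal sketch (assuming the Protegrity JAR now lives in lib/), the build.sbt would drop that coordinate from libraryDependencies entirely, since unmanaged JARs are never declared there; the commented-out unmanagedBase line shows the optional directory override, with "custom_lib" as a hypothetical name:

name := "Protegrity UDF"

version := "1.0"

scalaVersion := "2.11.8"

// Optional: look for unmanaged JARs somewhere other than the default lib/
// unmanagedBase := baseDirectory.value / "custom_lib"

// The Protegrity JAR in lib/ is picked up automatically as an unmanaged
// dependency, so only the Maven-resolved dependencies remain declared:
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.2" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.3.2" % "provided"
)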

Note that this will result in the unmanaged JAR being included in an assembly JAR. If you want to build a "slightly less fat" JAR that excludes the unmanaged JAR, you'll have to filter it out. One way to accomplish this is:

assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  // Return the classpath entries sbt-assembly should leave out of the fat JAR
  // (replace "name-of-unmanaged.jar" with the actual file name of your JAR)
  cp.filter(_.data.getName == "name-of-unmanaged.jar")
}
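
This relies on the sbt-assembly plugin being enabled for the project; if it isn't already, a minimal project/plugins.sbt would look something like the following (the version shown is just an example):

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")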

If you don't have the JAR (or perhaps something very close to the JAR) handy, how exactly do you expect the compiler to typecheck your calls into the JAR?如果您手边没有 JAR(或者可能非常接近 JAR),您希望编译器如何对 JAR 的调用进行类型检查?
