Can't shade jars with the shade plugin of SBT
I get a "Deduplicate found..." error when building the project with SBT:
[error] Deduplicate found different file contents in the following:
[error] Jar name = netty-all-4.1.68.Final.jar, jar org = io.netty, entry target = io/netty/handler/ssl/SslProvider.class
[error] Jar name = netty-handler-4.1.50.Final.jar, jar org = io.netty, entry target = io/netty/handler/ssl/SslProvider.class
...
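For context: those two jars are different artifacts (the netty-all fat jar versus the split netty-handler module) that ship the same classes, so version resolution alone cannot reconcile them. A minimal sketch for tracking down which dependencies pull each of them in, assuming sbt 1.4+ (which bundles the dependency-tree plugin):
// project/plugins.sbt -- enable the dependencyTree / whatDependsOn tasks
// that ship with sbt 1.4+ (diagnostic only, not part of the fix)
addDependencyTreePlugin
Running whatDependsOn io.netty netty-handler 4.1.50.Final from the sbt shell then prints the reverse dependency tree for the conflicting module.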
Now I am considering the option of shading all of the libraries (listed here):
libraryDependencies ++= Seq(
  "com.rometools" % "rome" % "1.18.0",
  "com.typesafe.scala-logging" %% "scala-logging" % "3.9.5", // log
  "ch.qos.logback" % "logback-classic" % "1.4.5", // log
  "com.lihaoyi" %% "upickle" % "1.6.0", // file-io
  "net.liftweb" %% "lift-json" % "3.5.0", // json
  "org.apache.spark" %% "spark-sql" % "3.2.2", // spark
  "org.apache.spark" %% "spark-core" % "3.2.2" % "provided", // spark
  "org.postgresql" % "postgresql" % "42.5.1", // spark + postgresql
)
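An alternative to shading everything, by the way, would be to exclude the overlapping netty module from whichever dependency transitively drags it in, so that only one copy of the classes remains. A sketch (lift-json here is just a placeholder culprit; check the dependency tree for the real one):
// sketch: drop the split netty-handler module from a hypothetical culprit,
// leaving netty-all as the only source of io.netty classes
libraryDependencies += ("net.liftweb" %% "lift-json" % "3.5.0")
  .exclude("io.netty", "netty-handler")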
So I added the following shade rules:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.lihaoyi.**" -> "crdaa.@1")
    .inLibrary("com.lihaoyi" %% "upickle" % "1.6.0")
    .inProject,
  ShadeRule.rename("ch.qos.logback.**" -> "crdbb.@1")
    .inLibrary("ch.qos.logback" % "logback-classic" % "1.4.5")
    .inProject,
  ShadeRule.rename("com.typesafe.**" -> "crdcc.@1")
    .inLibrary("com.typesafe.scala-logging" %% "scala-logging" % "3.9.5")
    .inProject,
  ShadeRule.rename("org.apache.spark.spark-sql.**" -> "crddd.@1")
    .inLibrary("org.apache.spark" %% "spark-sql" % "3.2.2")
    .inProject,
  ShadeRule.rename("org.apache.spark.spark-core.**" -> "crdee.@1")
    .inLibrary("org.apache.spark" %% "spark-core" % "3.2.2")
    .inProject,
  ShadeRule.rename("com.rometools.**" -> "crdff.@1")
    .inLibrary("com.rometools" % "rome" % "1.18.0")
    .inProject,
  ShadeRule.rename("org.postgresql.postgresql.**" -> "crdgg.@1")
    .inLibrary("org.postgresql" % "postgresql" % "42.5.1")
    .inProject,
  ShadeRule.rename("net.liftweb.**" -> "crdhh.@1")
    .inLibrary("net.liftweb" %% "lift-json" % "3.5.0")
    .inProject,
)
But after reloading SBT and running assembly, I got the same deduplicate errors.
What is wrong here?
P.S.:
ThisBuild / scalaVersion := "2.13.10"
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.0")
UPDATE
In the end I gave up on renaming in favor of the provided option + unmanagedJars, excluding the Spark dependencies (most of the errors were caused by them). After that, only the deduplicate errors for module-info.class remained, but the solution for those (changing the merge strategy) is described in the sbt-assembly docs.
That is, I downloaded Spark separately, copied its jars into the ./jarlib directory (!!! not into the ./lib directory), and changed the following in the build config:
libraryDependencies ++= Seq(
  //...
  "org.apache.spark" %% "spark-sql" % "3.2.3" % "provided",
  "org.apache.spark" %% "spark-core" % "3.2.3" % "provided",
)
unmanagedJars in Compile += file("./jarlib")
ThisBuild / assemblyMergeStrategy := {
  case PathList("module-info.class") => MergeStrategy.discard
  case x if x.endsWith("/module-info.class") => MergeStrategy.discard
  case x =>
    val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
    oldStrategy(x)
}
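One caveat about the unmanagedJars line above: += file("./jarlib") puts the directory itself on the classpath rather than the jars inside it. A sketch of globbing the jars instead, using sbt's standard PathFinder API (though, as UPDATE 2 below explains, the whole unmanagedJars route turned out to be unnecessary):
// sketch: add every jar under ./jarlib to the compile classpath;
// unmanagedJars expects the jar files themselves, not their directory
Compile / unmanagedJars ++= (baseDirectory.value / "jarlib" ** "*.jar").classpath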
The Spark jars are included in the final jar.
UPDATE 2
As was noted in the comments, unmanagedJars is useless in this case, so I removed the unmanagedJars line from build.sbt.
Note that the Spark jars, which are no longer included in the final jar, have to be on the classpath when you launch the jar.
In my case I copied the Spark jars + the final jar into the ./app folder and launched the jar via:
java -cp "./app/*" main.Main
...where main.Main is the main class. (The ./app/* classpath wildcard is expanded by the JVM itself to every jar in that folder.)
Something like this (put it in your build.sbt) is how you usually resolve the deduplicate errors that appear when your libraries pull in their own overlapping dependencies:
assemblyMergeStrategy in assembly := {
  case PathList("javax", "activation", _*) => MergeStrategy.first
  case PathList("com", "sun", _*) => MergeStrategy.first
  case "META-INF/io.netty.versions.properties" => MergeStrategy.first
  case "META-INF/mime.types" => MergeStrategy.first
  case "META-INF/mailcap.default" => MergeStrategy.first
  case "META-INF/mimetypes.default" => MergeStrategy.first
  case d if d.endsWith(".jar:module-info.class") => MergeStrategy.first
  case d if d.endsWith("module-info.class") => MergeStrategy.first
  case d if d.endsWith("/MatchersBinder.class") => MergeStrategy.discard
  case d if d.endsWith("/ArgumentsProcessor.class") => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
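As a side note, assemblyMergeStrategy in assembly is the old sbt 0.13-style scoping and is deprecated on current sbt; the same setting in slash syntax (cases abbreviated) would be:
// the same merge strategy in sbt 1.x slash syntax (cases unchanged)
assembly / assemblyMergeStrategy := {
  case "META-INF/io.netty.versions.properties" => MergeStrategy.first
  case d if d.endsWith("module-info.class") => MergeStrategy.first
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}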