
build.sbt: how to add Spark dependencies

Hello, I am trying to add spark-core, spark-streaming, twitter4j, and spark-streaming-twitter as dependencies in the build.sbt file below:

name := "hello"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.4.1"

libraryDependencies ++= Seq(
  "org.twitter4j" % "twitter4j-core" % "3.0.3",
  "org.twitter4j" % "twitter4j-stream" % "3.0.3"
)

libraryDependencies += "org.apache.spark" % "spark-streaming-twitter_2.10" % "0.9.0-incubating"

I simply copied these libraryDependencies entries from examples online, so I am not sure which versions, etc. to use.

Can someone please explain to me how I should fix this .sbt file? I spent a couple of hours trying to figure it out, but none of the suggestions worked. I installed Scala through Homebrew and I am on version 2.11.8.

All of my errors were about:

Modules were resolved with conflicting cross-version suffixes.

The problem is that you are mixing Scala 2.11 and 2.10 artifacts. You have:

scalaVersion := "2.11.8"

And then:

libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.4.1"

which requires the 2.10 artifact. You are also mixing Spark versions instead of using a consistent version:
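As background on the suffix: the `%%` operator appends the project's Scala binary version to the artifact name automatically, while `%` leaves the name untouched. A minimal sketch (versions chosen for illustration), assuming `scalaVersion := "2.11.8"`:

```scala
// With scalaVersion := "2.11.8", %% expands the artifact name for you:
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.6.1"

// ...which resolves to the same artifact as spelling the suffix out by hand:
libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "1.6.1"
```

Writing `% "spark-streaming_2.10"` under a 2.11 `scalaVersion` is exactly how the conflicting cross-version suffixes sneak in, so preferring `%%` for Scala libraries avoids the whole class of error.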

// spark 1.6.1
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"

// spark 1.4.1
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.4.1"

// spark 0.9.0-incubating
libraryDependencies += "org.apache.spark" % "spark-streaming-twitter_2.10" % "0.9.0-incubating"

Here is a build.sbt that fixes both problems:

name := "hello"

version := "1.0"

scalaVersion := "2.11.8"

val sparkVersion = "1.6.1"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-twitter" % sparkVersion
)

You also don't need to manually add the twitter4j dependencies, since they are pulled in transitively by spark-streaming-twitter.
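If you want to verify that yourself, one option is the sbt-dependency-graph plugin (the plugin version below is an example; check for the latest), added to `project/plugins.sbt`:

```scala
// project/plugins.sbt -- adds tasks like dependencyTree for inspecting
// the resolved dependency graph of the build
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2")
```

Running `sbt dependencyTree` should then show twitter4j-core and twitter4j-stream nested under spark-streaming-twitter, and `sbt evicted` lists any version conflicts sbt resolved for you.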

This works for me:

name := "spark_local"

version := "0.1"

scalaVersion := "2.11.8"


libraryDependencies ++= Seq(
  "org.twitter4j" % "twitter4j-core" % "3.0.5",
  "org.twitter4j" % "twitter4j-stream" % "3.0.5",
  "org.apache.spark" %% "spark-core" % "2.0.0",
  "org.apache.spark" %% "spark-sql" % "2.0.0",
  "org.apache.spark" %% "spark-mllib" % "2.0.0",
  "org.apache.spark" %% "spark-streaming" % "2.0.0"
)

I encountered similar issues and tried your methods above. However, I got the following warnings...

[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[warn] Found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[warn]  * io.netty:netty:3.9.9.Final is selected over {3.6.2.Final, 3.7.0.Final}
[warn]      +- org.apache.spark:spark-core_2.11:2.4.4             (depends on 3.9.9.Final)
[warn]      +- org.apache.zookeeper:zookeeper:3.4.6               (depends on 3.6.2.Final)
[warn]      +- org.apache.hadoop:hadoop-hdfs:2.6.5                (depends on 3.6.2.Final)
[warn]  * com.google.guava:guava:11.0.2 is selected over {12.0.1, 16.0.1}
[warn]      +- org.apache.hadoop:hadoop-yarn-client:2.6.5         (depends on 11.0.2)
[warn]      +- org.apache.hadoop:hadoop-yarn-api:2.6.5            (depends on 11.0.2)
[warn]      +- org.apache.hadoop:hadoop-yarn-common:2.6.5         (depends on 11.0.2)
[warn]      +- org.apache.hadoop:hadoop-yarn-server-nodemanager:2.6.5 (depends on 11.0.2)
[warn]      +- org.apache.hadoop:hadoop-yarn-server-common:2.6.5  (depends on 11.0.2)
[warn]      +- org.apache.hadoop:hadoop-hdfs:2.6.5                (depends on 11.0.2)
[warn]      +- org.apache.curator:curator-framework:2.6.0         (depends on 16.0.1)
[warn]      +- org.apache.curator:curator-client:2.6.0            (depends on 16.0.1)
[warn]      +- org.apache.curator:curator-recipes:2.6.0           (depends on 16.0.1)
[warn]      +- org.apache.hadoop:hadoop-common:2.6.5              (depends on 16.0.1)
[warn]      +- org.htrace:htrace-core:3.0.4                       (depends on 12.0.1)

Any idea how to resolve this? I downloaded the latest Spark 2.4.4 package, and I am using Hortonworks Sandbox 3.0, sbt 1.3.3, and Scala 2.11.12. Thank you very much!
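Note that these eviction messages are warnings, not errors: sbt has already resolved each conflict by selecting one version (the one named before "is selected over"). If the build runs, they can often be left alone. If you want to make the choice explicit, or force a different version after testing, a sketch using sbt 1.x's `dependencyOverrides` (the versions below mirror the warning output above and are only a starting point, not a known-good combination):

```scala
// build.sbt -- pin conflicting transitive dependencies to explicit versions,
// making sbt's implicit eviction choice (or your own override) deliberate
dependencyOverrides ++= Seq(
  "io.netty"         % "netty" % "3.9.9.Final",
  "com.google.guava" % "guava" % "16.0.1"
)
```

Whether guava should be pinned to 11.0.2 (what Hadoop wants and sbt selected) or 16.0.1 (what Curator wants) depends on which code paths you actually exercise, so test either choice rather than trusting the warning alone.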

