How to run Kafka as a stream for Apache Spark using Scala 2.11?
I haven't been able to find any build of the Spark Streaming integration for Kafka for Scala 2.11. There is one available at http://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka_2.10, but it is for 2.10.

Can anyone point me to a 2.11 build?
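In sbt terms, what I'm after is something like the following (a hypothetical coordinate, since only the `_2.10` artifact is published, so the cross-versioned dependency below will not resolve against 2.11):

```scala
// build.sbt -- hypothetical: no _2.11 build of spark-streaming-kafka
// exists as of Spark 1.3, so the %% (cross-version) dependency below
// would fail to resolve.
scalaVersion := "2.11.6"

libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka" % "1.3.0"
```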
It's not feasible to run Spark Kafka against Scala 2.11 for now (as of Spark 1.3).

If no pre-built version is available, you can build Spark yourself to fulfil your needs by specifying some build parameters.
The detailed build procedure can be found in Building Spark.

In short, it only takes two steps to build against scala-2.10:
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
You should specify the profiles or properties that fit your situation in the second command.
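If you also need the freshly built modules to be visible to a local Maven or sbt project, the same command with `install` instead of `package` publishes them to your local repository; this is a sketch of that variant, not a line from the build docs:

```shell
# Sketch: same profiles/properties as above, but `install` copies the
# built artifacts (spark-core_2.10, spark-streaming-kafka_2.10, ...)
# into the local Maven repository (~/.m2) so other projects can depend on them.
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean install
```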
Note the part of Building Spark that covers Building for Scala 2.11:
To produce a Spark package compiled with Scala 2.11, use the -Dscala-2.11 property:
dev/change-version-to-2.11.sh
mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package
Scala 2.11 support in Spark does not support a few features due to dependencies which are themselves not Scala 2.11 ready. Specifically, Spark's external Kafka library and JDBC component are not yet supported in Scala 2.11 builds.