
Spark streaming kafka using scala

I am trying to build a Kafka consumer in Scala using IntelliJ to read messages from a Kafka topic. I have both Spark and Kafka on Windows.

I tried this code:

 import org.apache.spark._
 import org.apache.spark.streaming._
 import org.apache.spark.streaming.kafka010._
 import org.apache.kafka.common.serialization.StringDeserializer
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
 import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

 object connector {
   def main(args: Array[String]) {
     class Kafkaconsumer {
       val kafkaParams = Map[String, Object](
         "bootstrap.servers" -> "host1:port,host2:port2,host3:port3",
         "key.deserializer" -> classOf[StringDeserializer],
         "value.deserializer" -> classOf[StringDeserializer],
         "group.id" -> "use_a_separate_group_id_for_each_stream",
         "auto.offset.reset" -> "latest",
         "enable.auto.commit" -> (false: java.lang.Boolean)
       )
       val sparkConf = new SparkConf().setMaster("yarn")
         .setAppName("kafka example")
       val streamingContext = new StreamingContext(sparkConf, Seconds(10))
       val topics = Array("topicname")
       val topicsSet = topics.split(",").toSet
       val stream = KafkaUtils.createDirectStream[String, String](
         streamingContext, PreferConsistent, Subscribe[String, String](kafkaParams, topicsSet)
       )
       stream.print()
       stream.map(record => (record.key, record.value))
       streamingContext.start()
       streamingContext.awaitTermination()
     }
   }
 }

But I get an error on these two lines:

 val topicsSet = topics.split(",").toSet
 streamingContext, PreferConsistent, Subscribe[String, String](kafkaParams, topicsSet)

The split function and the Subscribe function are always highlighted in red.

Any ideas?

Thanks

This code worked well for me:

 import org.apache.kafka.common.serialization.StringDeserializer
 import org.apache.spark.{SparkConf, SparkContext}
 import org.apache.spark.streaming.{Seconds, StreamingContext}
 import org.apache.spark.streaming.kafka010.KafkaUtils
 import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
 import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

 object Conector_kafka_spark extends App {
   // Run locally, capping the per-partition ingest rate
   val conf = new SparkConf()
     .setAppName("SparkStreamingKafka010Demo")
     .setMaster("local[*]")
     .set("spark.streaming.kafka.maxRatePerPartition", "100")
   val sc = new SparkContext(conf)
   val streamingContext = new StreamingContext(sc, Seconds(10))

   // Standard Kafka consumer settings
   val kafkaParams = Map[String, Object](
     "bootstrap.servers" -> "localhost:9092",
     "key.deserializer" -> classOf[StringDeserializer],
     "value.deserializer" -> classOf[StringDeserializer],
     "group.id" -> "kafka_demo_group",
     "auto.offset.reset" -> "earliest",
     "enable.auto.commit" -> (true: java.lang.Boolean)
   )

   // Subscribe takes the topic collection first, then the Kafka parameters
   val topics = Array("message")
   val stream = KafkaUtils.createDirectStream[String, String](
     streamingContext,
     PreferConsistent,
     Subscribe[String, String](topics, kafkaParams)
   )
 }
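
Two differences from the question's code explain the red lines: split is defined on String, not on Array, so "topicname".split(",").toSet would compile where topics.split(",") does not, and Subscribe expects the topic collection before the Kafka parameters, which is the order used above. Note also that the snippet above only builds the DStream. As a minimal continuation sketch (reusing the stream and streamingContext values from the code above), an output operation and a started context are still needed before anything is actually consumed:

 // Continuation sketch: a DStream does nothing until an output operation
 // is registered and the streaming context is started.
 val pairs = stream.map(record => (record.key, record.value))
 pairs.print()                       // show a sample of each 10-second batch
 streamingContext.start()            // begin pulling records from Kafka
 streamingContext.awaitTermination() // block until the job is stopped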
