Spark Streaming Kafka using Scala
I am trying to build a Kafka consumer in Scala with IntelliJ to read messages from a Kafka topic. I have both Spark and Kafka on Windows.

I tried this code:
```scala
import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka010._
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object connector {
  def main(args: Array[String]) {
    class Kafkaconsumer {
      val kafkaParams = Map[String, Object](
        "bootstrap.servers" -> "host1:port,host2:port2,host3:port3",
        "key.deserializer" -> classOf[StringDeserializer],
        "value.deserializer" -> classOf[StringDeserializer],
        "group.id" -> "use_a_separate_group_id_for_each_stream",
        "auto.offset.reset" -> "latest",
        "enable.auto.commit" -> (false: java.lang.Boolean)
      )
      val sparkConf = new SparkConf().setMaster("yarn")
        .setAppName("kafka example")
      val streamingContext = new StreamingContext(sparkConf, Seconds(10))
      val topics = Array("topicname")
      val topicsSet = topics.split(",").toSet
      val stream = KafkaUtils.createDirectStream[String, String](
        streamingContext, PreferConsistent, Subscribe[String, String](kafkaParams, topicsSet)
      )
      stream.print()
      stream.map(record => (record.key, record.value))
      streamingContext.start()
      streamingContext.awaitTermination()
    }
  }
}
```
But I get an error on these two lines:

```scala
val topicsSet = topics.split(",").toSet
streamingContext, PreferConsistent, Subscribe[String, String](kafkaParams, topicsSet)
```

The `split` function and the `Subscribe` function are always shown in red.

Any ideas?

Thanks
This code works well for me:
```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object Conector_kafka_spark extends App {
  val conf = new SparkConf()
    .setAppName("SparkStreamingKafka010Demo")
    .setMaster("local[*]")
    .set("spark.streaming.kafka.maxRatePerPartition", "100")
  val sc = new SparkContext(conf)
  val streamingContext = new StreamingContext(sc, Seconds(10))

  val kafkaParams = Map[String, Object](
    "bootstrap.servers" -> "localhost:9092",
    "key.deserializer" -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id" -> "kafka_demo_group",
    "auto.offset.reset" -> "earliest",
    "enable.auto.commit" -> (true: java.lang.Boolean)
  )

  val topics = Array("message")
  val stream = KafkaUtils.createDirectStream[String, String](
    streamingContext,
    PreferConsistent,
    Subscribe[String, String](topics, kafkaParams)
  )
}
```
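For reference, the two red lines in the question fail for different reasons: `topics` is declared as `Array("topicname")`, and `split` is a method on `String`, not on `Array[String]`; and `Subscribe` expects the topic collection as its first argument and the Kafka parameter map as its second, the reverse of the question's call. A minimal sketch of the corrected lines, reusing the question's variable names:

```scala
// split is defined on String, so split a comma-separated String, not an Array
val topicsSet: Set[String] = "topicname".split(",").toSet
println(topicsSet) // Set(topicname)

// Subscribe takes the topics first, then the parameter map:
//   Subscribe[String, String](topicsSet, kafkaParams)
// (the question passed them in the opposite order)
```

Note that, as in the question's code, the stream still needs `streamingContext.start()` and `streamingContext.awaitTermination()` before any records are actually consumed.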