
Apache Spark with Kafka stream - Missing Kafka

I am trying to set up Apache Spark with Kafka and wrote a simple program locally, but it fails and I cannot work out why from debugging.

build.gradle.kts

implementation ("org.jetbrains.kotlin:kotlin-stdlib:1.4.0")
implementation ("org.jetbrains.kotlinx.spark:kotlin-spark-api-3.0.0_2.12:1.0.0-preview1")
compileOnly("org.apache.spark:spark-sql_2.12:3.0.0")
implementation("org.apache.kafka:kafka-clients:3.0.0")

The code of the main function is:

val spark = SparkSession
    .builder()
    .master("local[*]")
    .appName("Ship metrics")
    .orCreate

val shipmentDataFrame = spark
    .readStream()
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "test")
    .option("includeHeaders", "true")
    .load()

val query = shipmentDataFrame.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

query.writeStream()
    .format("console")
    .outputMode("append")
    .start()
    .awaitTermination()

and I get this error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:666)
    at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:194)
    at com.tgt.ff.axon.shipmetriics.stream.ShipmentStream.run(ShipmentStream.kt:23)
    at com.tgt.ff.axon.shipmetriics.ApplicationKt.main(Application.kt:12)
21/12/25 22:22:56 INFO SparkContext: Invoking stop() from shutdown hook 

The Kotlin API for Spark by JetBrains (https://github.com/Kotlin/kotlin-spark-api) has supported streaming since the 1.1.0 update. There is also a Kafka streaming example that may help you: https://github.com/Kotlin/kotlin-spark-api/blob/spark-3.2/examples/src/main/kotlin/org/jetbrains/kotlinx/spark/examples/streaming/KotlinDirectKafkaWordCount.kt

Note that this example uses the Spark DStream API rather than the Spark Structured Streaming API you appear to be using.

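Of course, you can still use Structured Streaming if you prefer, but the application then needs to be deployed as described in the deployment section of the "Structured Streaming + Kafka Integration Guide" that the error message refers to.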
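The "Failed to find data source: kafka" error usually means that the Kafka connector for Structured Streaming is not on the classpath. The kafka-clients library alone does not register the "kafka" format. A minimal sketch of the build.gradle.kts change follows, assuming the Scala 2.12 and Spark 3.0.0 versions from the question. The connector version is an assumption and should match whatever Spark version you actually run:

```kotlin
dependencies {
    // Dependencies already present in the question
    implementation("org.jetbrains.kotlin:kotlin-stdlib:1.4.0")
    implementation("org.jetbrains.kotlinx.spark:kotlin-spark-api-3.0.0_2.12:1.0.0-preview1")
    compileOnly("org.apache.spark:spark-sql_2.12:3.0.0")

    // Assumed fix: the Structured Streaming Kafka connector, which provides
    // the "kafka" data source used by spark.readStream().format("kafka")
    implementation("org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0")
}
```

The connector pulls in a compatible kafka-clients version transitively, so the explicit kafka-clients:3.0.0 line from the question is left out of this sketch. Keeping it may or may not cause a version conflict. When launching through spark-submit rather than from Gradle, the same coordinates can be passed with the --packages option.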

