
Exception in thread "main" org.apache.spark.sql.AnalysisException:

I am trying out Kafka Spark Structured Streaming, but I am hitting an exception: Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'device' given input columns: [value, offset, partition, key, timestamp, timestampType, topic];

Attaching my code:

import org.apache.spark.sql._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
case class DeviceData(device: String, deviceType: String, signal: String)

object dataset_kafka {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("kafka-consumer")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    spark.sparkContext.setLogLevel("WARN")


    val df = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "172.21.0.187:9093")
      .option("subscribe", "test")
      .option("startingOffsets", "earliest")
      .load()
    println(df.isStreaming)
    df.printSchema()

    // Both of the following fail with the AnalysisException above: the raw
    // Kafka stream has no 'device' or 'signal' column, only
    // [value, offset, partition, key, timestamp, timestampType, topic]
    val ds: Dataset[DeviceData] = df.as[DeviceData]

    val values = df.select("device").where("signal == Strong")

    values.writeStream
      .outputMode("append")
      .format("console")
      .start()
      .awaitTermination()


  }
}

Any help on how to solve this?

A Kafka stream always produces the following fields: value, offset, partition, key, timestamp, timestampType, topic. In your case you are interested in value, but note that the value is always deserialized as a byte array, so it needs to be cast to String before the JSON can be deserialized.
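For reference, printing the schema of the raw stream (the df.printSchema() call in your code) shows only Kafka's own columns, which is why device cannot be resolved; the output looks roughly like this:

root
 |-- key: binary (nullable = true)
 |-- value: binary (nullable = true)
 |-- topic: string (nullable = true)
 |-- partition: integer (nullable = true)
 |-- offset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- timestampType: integer (nullable = true)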

Try the following code:

import spark.implicits._

val kafkaStream =
  spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "172.21.0.187:9093")
    .option("subscribe", "test")
    .option("startingOffsets", "earliest")
    .load()

// If you don't want to build the schema manually
import org.apache.spark.sql.Encoders
val schema = Encoders.product[DeviceData].schema
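
// Alternatively, build the same schema by hand (a minimal sketch; all three
// DeviceData fields are plain strings, matching the case class above)
import org.apache.spark.sql.types.{StringType, StructType}
val manualSchema = new StructType()
  .add("device", StringType)
  .add("deviceType", StringType)
  .add("signal", StringType)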

import org.apache.spark.sql.functions.from_json
// Cast the binary value to a string, parse the JSON into a struct named
// "data", then flatten it so the top-level columns match DeviceData
val ds = kafkaStream
  .select(from_json($"value".cast("string"), schema) as "data")
  .select("data.*")
  .as[DeviceData]

val values = ds.filter(_.signal == "Strong").map(_.device)
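
The stream still needs a sink before anything runs; a minimal sketch reusing the console sink from the question:

values.writeStream
  .outputMode("append")
  .format("console")
  .start()
  .awaitTermination()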
