
Spark - aggregateByKey type mismatch error

I'm trying to understand the problem behind this. I'm trying to find the maximum marks for each student using aggregateByKey.

import spark.implicits._

val data = Seq(("R1","M",22),("R1","E",25),("R1","F",29),
               ("R2","M",20),("R2","E",32),("R2","F",52))
          .toDF("Name","Subject","Marks")

def seqOp = (acc: Int, ele: (String, Int)) => if (acc > ele._2) acc else ele._2
def combOp = (acc: Int, acc1: Int) => if (acc > acc1) acc else acc1

val r = data.rdd.map { case (t1, t2, t3) => (t1, (t2, t3)) }.aggregateByKey(0)(seqOp, combOp)

I get an error: aggregateByKey expects (Int, (Any, Any)) but the actual type is (Int, (String, Int)).

Your map function is incorrect: its input is a Row, not a Tuple3, so the pattern match `case (t1, t2, t3)` cannot infer the field types.

Fix the last line:

val r = data.rdd.map { r =>
      // extract each Row field with an explicit type
      val t1 = r.getAs[String](0)   // Name
      val t2 = r.getAs[String](1)   // Subject
      val t3 = r.getAs[Int](2)      // Marks
      (t1, (t2, t3))
    }.aggregateByKey(0)(seqOp, combOp)
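To see that seqOp and combOp really compute the per-key maximum, here is a minimal plain-Scala simulation of the aggregation (no Spark required, and not Spark's actual execution path): foldLeft over each key's values plays the role that applying seqOp within a partition plays in Spark.

```scala
// Sample data in the (key, (subject, marks)) shape produced by the fixed map.
val rows = Seq(("R1", ("M", 22)), ("R1", ("E", 25)), ("R1", ("F", 29)),
               ("R2", ("M", 20)), ("R2", ("E", 32)), ("R2", ("F", 52)))

// Same functions as in the question.
val seqOp  = (acc: Int, ele: (String, Int)) => if (acc > ele._2) acc else ele._2
val combOp = (acc: Int, acc1: Int)          => if (acc > acc1) acc else acc1

// Group by key, then fold each group with seqOp starting from the zero value 0.
// combOp would merge per-partition results; with a single "partition" it is unused.
val result: Map[String, Int] =
  rows.groupBy(_._1).map { case (k, vs) =>
    k -> vs.map(_._2).foldLeft(0)(seqOp)
  }

println(result)  // Map(R1 -> 29, R2 -> 52)
```

For a DataFrame-only alternative, `data.groupBy("Name").agg(max("Marks"))` yields the same per-student maximum without dropping to the RDD API.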

