Spark - aggregateByKey type mismatch error
I am trying to find the problem behind this. I am trying to find the maximum Marks for each student using aggregateByKey.
import spark.implicits._  // needed for toDF

val data = Seq(("R1","M",22),("R1","E",25),("R1","F",29),
               ("R2","M",20),("R2","E",32),("R2","F",52))
  .toDF("Name","Subject","Marks")
def seqOp = (acc: Int, ele: (String, Int)) => if (acc > ele._2) acc else ele._2
def combOp = (acc: Int, acc1: Int) => if (acc > acc1) acc else acc1
val r = data.rdd.map{case(t1,t2,t3)=> (t1,(t2,t3))}.aggregateByKey(0)(seqOp,combOp)
I am getting an error that aggregateByKey accepts (Int, (Any, Any)) but the actual type is (Int, (String, Int)).
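To make the expected result concrete, here is a minimal plain-Scala sketch (no Spark) of the per-key fold I want aggregateByKey to perform; the data and the two functions mirror the code above:

```scala
// Hypothetical local data mirroring the DataFrame rows, keyed by Name.
val records = Seq(("R1", ("M", 22)), ("R1", ("E", 25)), ("R1", ("F", 29)),
                  ("R2", ("M", 20)), ("R2", ("E", 32)), ("R2", ("F", 52)))

val seqOp = (acc: Int, ele: (String, Int)) => if (acc > ele._2) acc else ele._2
val combOp = (acc: Int, acc1: Int) => if (acc > acc1) acc else acc1

// aggregateByKey folds each partition with seqOp from the zero value 0,
// then merges partial results with combOp; here, one fold per key suffices.
val maxByKey = records.groupBy(_._1).map { case (k, vs) =>
  k -> vs.map(_._2).foldLeft(0)(seqOp)
}
// maxByKey: Map(R1 -> 29, R2 -> 52)
```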
Your map function is incorrect, since its input is a Row, not a Tuple3.
Fix the last line with:
val r = data.rdd.map { r =>
  val t1 = r.getAs[String](0)
  val t2 = r.getAs[String](1)
  val t3 = r.getAs[Int](2)
  (t1, (t2, t3))
}.aggregateByKey(0)(seqOp, combOp)
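As an alternative sketch, the same extraction can be written by pattern matching on the Row directly (Spark's Row provides an extractor for this); the column types (String, String, Int) are assumed from the toDF call above:

```scala
import org.apache.spark.sql.Row

// Destructure each Row by position instead of calling getAs three times.
// Assumes the schema declared above: Name: String, Subject: String, Marks: Int.
val r2 = data.rdd.map { case Row(name: String, subject: String, marks: Int) =>
  (name, (subject, marks))
}.aggregateByKey(0)(seqOp, combOp)
// r2 is an RDD[(String, Int)]: the maximum Marks per Name.
```

Note that the type ascriptions in the pattern are what give you String and Int back out of the untyped Row; a row whose fields do not match would throw a MatchError at runtime, so getAs is the safer choice when the schema is not guaranteed.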