
value reduceByKey is not a member of org.apache.spark.rdd.RDD

My Spark version is 2.1.1 and my Scala version is 2.11.

import org.apache.spark.SparkContext._
import com.mufu.wcsa.component.dimension.{DimensionKey, KeyTrait}
import com.mufu.wcsa.log.LogRecord
import org.apache.spark.rdd.RDD

object PV {

  def stat[C <: LogRecord, K <: DimensionKey](statTrait: KeyTrait[C, K], logRecords: RDD[C]): RDD[(K, Int)] = {
    val t = logRecords.map(record => (statTrait.getKey(record), 1)).reduceByKey((x, y) => x + y)
    t
  }
}

I got this error:

at 1502387780429
[ERROR] /Users/lemanli/work/project/newcma/wcsa/wcsa_my/wcsavistor/src/main/scala/com/mufu/wcsa/component/stat/PV.scala:25: error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(K, Int)]
[ERROR]     val t = logRecords.map(record =>(statTrait.getKey(record),1)).reduceByKey((x,y) => x + y)

The trait is defined as:

trait KeyTrait[C <: LogRecord,K <: DimensionKey]{
  def getKey(c:C):K
}

It compiles now, thanks.

 def stat[C <: LogRecord, K <: DimensionKey : ClassTag : Ordering](statTrait: KeyTrait[C, K], logRecords: RDD[C]): RDD[(K, Int)] = {
    val t = logRecords.map(record => (statTrait.getKey(record), 1)).reduceByKey((x, y) => x + y)
    t
  }

The key type needs to provide an Ordering[T] instance.

  object ClientStat extends KeyTrait[DetailLogRecord, ClientStat] {

    implicit val clientStatSorting = new Ordering[ClientStat] {
      override def compare(x: ClientStat, y: ClientStat): Int = x.key.compare(y.key)
    }

    def getKey(detailLogRecord: DetailLogRecord): ClientStat = new ClientStat(detailLogRecord)
  }
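To see the pattern outside of Spark, here is a minimal, self-contained sketch of the same idea: a key class with an implicit Ordering in its companion object, so the instance is found automatically wherever the key is used (the Key class and keyOrdering name below are illustrative, not from the original code):

```scala
// Minimal sketch (plain Scala, no Spark): a hypothetical key class with an
// implicit Ordering, analogous to ClientStat's clientStatSorting above.
final case class Key(value: String)

object Key {
  // Placing the instance in the companion object makes it implicitly
  // available wherever an Ordering[Key] is needed, without an extra import.
  implicit val keyOrdering: Ordering[Key] = new Ordering[Key] {
    override def compare(x: Key, y: Key): Int = x.value.compare(y.value)
  }
}

val sorted = List(Key("b"), Key("a")).sorted // resolves Key.keyOrdering
```

Putting the Ordering in the companion object (rather than requiring an import at each call site) is what lets a context bound like `K : Ordering` resolve silently.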

This comes from using a pair RDD function generically. The reduceByKey method is actually a method of the PairRDDFunctions class, which is reached via an implicit conversion from RDD:

implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V]

So it requires several implicit typeclasses. Normally, when working with simple concrete types, those are already in scope, but you should be able to amend your method to also require those same implicits:

def stat[C <: LogRecord, K <: DimensionKey](statTrait: KeyTrait[C, K], logRecords: RDD[C])(implicit kt: ClassTag[K], ord: Ordering[K])

Or, using the newer context-bound syntax:

def stat[C <: LogRecord, K <: DimensionKey : ClassTag : Ordering](statTrait: KeyTrait[C, K], logRecords: RDD[C])
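The two signatures are equivalent: a context bound is just sugar for an extra implicit parameter list. A minimal Spark-free sketch of the desugaring (the maxKey names are illustrative only):

```scala
import scala.reflect.ClassTag

// Context-bound form: K must have ClassTag and Ordering instances in scope.
def maxKey[K: ClassTag : Ordering](pairs: Seq[(K, Int)]): K =
  pairs.map(_._1).max // max uses the implicit Ordering[K]

// Desugared form: the compiler rewrites the context bounds into this.
def maxKeyDesugared[K](pairs: Seq[(K, Int)])
                      (implicit kt: ClassTag[K], ord: Ordering[K]): K =
  pairs.map(_._1).max

val r = maxKey(Seq(("a", 1), ("b", 2))) // "b"
```

This is exactly why adding `: ClassTag : Ordering` to the generic `K` in `stat` makes `rddToPairRDDFunctions` applicable.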

reduceByKey is a method that is only defined on RDDs of tuples, i.e. RDD[(K, V)] (K and V are just a convention saying the first element is the key and the second is the value).

It is not clear from the example what you are trying to achieve, but you definitely need to convert the values inside the RDD to tuples of two elements first.
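The semantics of reduceByKey can be sketched on a local collection, without Spark: group the (key, value) tuples by key, then reduce each group's values (the reduceByKeyLocal name below is illustrative, not a Spark API):

```scala
// Minimal sketch (plain Scala, no Spark) of reduceByKey's semantics on a
// local collection of (key, value) tuples.
def reduceByKeyLocal[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
  pairs
    .groupBy(_._1)                                            // Map[K, Seq[(K, V)]]
    .map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }    // reduce each group

val counts = reduceByKeyLocal(Seq(("a", 1), ("b", 1), ("a", 1)))(_ + _)
// counts: Map("a" -> 2, "b" -> 1)
```

Spark's version does the same thing distributed across partitions, which is why it is only defined when the RDD's element type is a two-element tuple.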
