Scala/Spark: reduce an RDD of class objects

I need your help with the last step of a school project.

val conf: SparkConf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("AppName")
  .set("spark.driver.host", "localhost")
val sc: SparkContext = new SparkContext(conf)
var list_creature = new ListBuffer[creature]()
list_creature += new creature("ska")
list_creature(0).addspell("Heal")
list_creature(0).addspell("Attaque")
list_creature += new creature("moise")
list_creature(1).addspell("Tank")
list_creature(1).addspell("Defense")
list_creature(1).addspell("Attaque")
val rdd = sc.parallelize(list_creature)
val y = rdd.map(e => (e.name, e.Spells)).collect()
val z = y.flatMap(x => ListBuffer(x._2 -> x._1))
val ze = z.flatMap(e => e._1.flatMap(x => ListBuffer(x -> e._2)))

I get this as a result:

(Heal,ska)
(Attaque,ska)
(Tank,moise)
(Defense,moise)
(Attaque,moise)

So I want to reduce this List[List[String]] to get a List[(String, List[String])], and the result would be:

(Heal,(ska))
(Attaque,(ska,moise))
(Tank,(moise))
(Defense,(moise))

Thanks, you're the best!

Not sure why you create an RDD and then collect before all the major transformations. Since you didn't provide the definition of class Creature, I'm creating a placeholder class based on your question content as follows:

class Creature(val name: String) extends Serializable {
  var spells: List[String] = List.empty[String]
  def addspell(spell: String): Unit = {
    spells ::= spell
  }
}

import scala.collection.mutable.ListBuffer

val list_creature = ListBuffer[Creature]()

list_creature += new Creature("ska")
list_creature(0).addspell("Heal")
list_creature(0).addspell("Attaque")

list_creature += new Creature("moise")
list_creature(1).addspell("Tank")
list_creature(1).addspell("Defense")
list_creature(1).addspell("Attaque")

val rdd = sc.parallelize(list_creature)

val reducedRDD = rdd.flatMap( c => c.spells.map(s => (s, List(c.name))) ).
  reduceByKey( _ ++ _ )

reducedRDD.collect
// res1: Array[(String, List[String])] = Array(
//   (Heal,List(ska)), (Defense,List(moise)), (Attaque,List(ska, moise)), (Tank,List(moise))
// )
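
To see what the flatMap plus reduceByKey step is doing without needing a SparkContext, the same inversion can be sketched on plain Scala collections. This is only an illustration under some assumptions: the case class `Creature` here is a simplified stand-in for the placeholder class above, and the names `SpellIndex` and `invert` are hypothetical, not part of the original question or answer.

```scala
// Hypothetical, simplified stand-in for the placeholder class above.
final case class Creature(name: String, spells: List[String])

object SpellIndex {
  // Invert creature -> spells into spell -> creatures.
  def invert(creatures: Seq[Creature]): Map[String, List[String]] =
    creatures
      .flatMap(c => c.spells.map(s => (s, c.name))) // one (spell, creature) pair per spell
      .groupBy(_._1)                                // collect the pairs that share a spell
      .map { case (spell, pairs) => spell -> pairs.map(_._2).toList }

  def main(args: Array[String]): Unit = {
    val creatures = Seq(
      Creature("ska", List("Heal", "Attaque")),
      Creature("moise", List("Tank", "Defense", "Attaque"))
    )
    // Sort by spell name so the printed order is stable.
    invert(creatures).toSeq.sortBy(_._1).foreach(println)
    // (Attaque,List(ska, moise))
    // (Defense,List(moise))
    // (Heal,List(ska))
    // (Tank,List(moise))
  }
}
```

On an RDD the order of the collected pairs, and of the names inside each list, is not guaranteed, so sort the result if a stable order matters. In Spark itself, groupByKey followed by mapValues(_.toList) would be a comparable way to build the same lists.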
