

Saving double RDD into file - Scala

I am trying to save a double RDD to a file. What I mean by a double RDD is that I have this variable:

res: org.apache.spark.rdd.RDD[org.apache.spark.rdd.RDD[((String,String), Int)]] = MapPartitionsRDD[19] 

I tried to store it with

res.saveAsTextFile(path)

But it doesn't work: an exception is thrown because Spark does not support nested RDDs. Here is a sample of the code:

val res = Listword.map { x =>
  Listword.map { y =>
    ((x._1, y._1), x._2 + y._2)
  }
}
res.saveAsTextFile("C:/Users/Administrator/Documents/spark/spark-1.6.0-bin-hadoop2.6")

Spark does not allow nested RDDs. In your specific case, you can use cartesian:

ListWord.cartesian(ListWord).map { case (x, y) =>
  ((x._1, y._1), x._2 + y._2)
}
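For completeness, here is a minimal, self-contained sketch of the cartesian approach, assuming a small sample dataset of (String, Int) pairs; the context setup, sample data, and output path below are illustrative, not taken from the original question:

import org.apache.spark.{SparkConf, SparkContext}

object CartesianPairs {
  def main(args: Array[String]): Unit = {
    // Local Spark context for the example (illustrative configuration)
    val sc = new SparkContext(new SparkConf().setAppName("cartesian-pairs").setMaster("local[*]"))

    // Hypothetical sample data standing in for ListWord: an RDD[(String, Int)]
    val listWord = sc.parallelize(Seq(("foo", 1), ("bar", 2), ("baz", 3)))

    // cartesian pairs every element with every element, producing a flat
    // RDD[((String, String), Int)] instead of a nested RDD of RDDs
    val res = listWord.cartesian(listWord).map { case (x, y) =>
      ((x._1, y._1), x._2 + y._2)
    }

    // A flat pair RDD can be written out directly; saveAsTextFile creates the
    // output directory and writes one part file per partition (it fails if the
    // directory already exists)
    res.saveAsTextFile("output/cartesian-pairs")

    sc.stop()
  }
}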
