Saving RDDs in a for loop in Scala

I have a for loop that produces an RDD on every iteration, and I want to keep each of these RDDs for later use. What is the most efficient way to save and access them?

Thanks for the help in advance!

Sample code, without RDD and Spark specifics:

scala> val res = (for (i <- (1 to 10);
     |  j=2*i;
     |  k= s"i: $i j: $j") yield k)
res: scala.collection.immutable.IndexedSeq[String] = Vector(i: 1 j: 2, i: 2 j: 4, i: 3 j: 6, i: 4 j: 8, i: 5 j: 10, i: 6 j: 12, i: 7 j: 14, i: 8 j: 16, i: 9 j: 18, i: 10 j: 20)

scala> res(0)
res201: String = i: 1 j: 2

scala> res(1)
res202: String = i: 2 j: 4

So just yield your RDDs from the for comprehension and collect them in a Seq; you can then access each one by index whenever you need it.
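Applied to Spark, the same pattern might look like the sketch below. It is only an illustration under some assumptions: the object name `YieldRdds`, the helper `buildRdds`, and the toy data are hypothetical, and the `local[*]` master is just for trying it out. The call to `cache()` is an optional addition, not part of the answer above; since RDDs are evaluated lazily, caching can avoid recomputing an RDD each time it is reused.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object YieldRdds {
  // Hypothetical helper: builds one RDD per iteration and yields it,
  // so the for comprehension returns a Seq of RDDs.
  def buildRdds(sc: SparkContext, n: Int): Seq[RDD[Int]] =
    for (i <- 1 to n) yield {
      // Toy data (an assumption): the numbers 1 to 5, scaled by i.
      // cache() marks the RDD for reuse once it has been computed.
      sc.parallelize(1 to 5).map(_ * i).cache()
    }

  def main(args: Array[String]): Unit = {
    // Local master is assumed here only so the sketch runs standalone.
    val conf = new SparkConf().setAppName("yield-rdds").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdds = buildRdds(sc, 10)
      // Access the saved RDDs by index, like res(0) and res(1) above.
      println(rdds(0).collect().mkString(", "))
      println(rdds(1).collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Nothing is computed until an action such as `collect()` runs, so the Seq itself only holds references to the RDDs and is cheap to keep around.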

You can also yield multiple values at once, as a tuple:

yield (i, j, k)

and destructure the resulting tuples later, filter them, group them, and so on.
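A minimal sketch of the tuple variant, in plain Scala without Spark. The object name `YieldTuples` and the value names `evenLabels` and `byParity` are hypothetical and only serve the illustration.

```scala
object YieldTuples {
  // Same comprehension as in the sample above, but yielding all three values.
  val res: Seq[(Int, Int, String)] =
    for {
      i <- 1 to 10
      j = 2 * i
      k = s"i: $i j: $j"
    } yield (i, j, k)

  // Destructure with a pattern match and filter in one step:
  // keep only the labels whose i is even.
  val evenLabels: Seq[String] =
    res.collect { case (i, _, k) if i % 2 == 0 => k }

  // Group the tuples by whether i is even.
  val byParity: Map[Boolean, Seq[(Int, Int, String)]] =
    res.groupBy { case (i, _, _) => i % 2 == 0 }

  def main(args: Array[String]): Unit = {
    // Destructure a single element into separate vals.
    val (i, j, k) = res(0)
    println(s"$i, $j, $k")
    evenLabels.foreach(println)
  }
}
```

The same pattern carries over if one of the tuple elements is an RDD, for example yielding `(i, rdd)` so that each RDD stays paired with the loop variable that produced it.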
