Does the transform operation produce a single RDD in a DStream?

Question

I am using Spark Streaming and I don't really understand the transform operation. Here is my code:

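import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("streaming").setMaster("local[4]")
val ssc = new StreamingContext(conf, Seconds(1)) // 1-second batch interval
val mDstream =
  ssc
    .socketTextStream(args(0), 9999)
    .flatMap(x => x.split(" "))
    .map((_, 1))
    // count words over a 10-second window that slides every 3 seconds
    .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(10), Seconds(3))
    // sort each window's counts in descending order
    .transform(rdd => rdd.sortBy(_._2, false))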

I want to know how many RDDs there are in mDstream. Thanks in advance!

Answer 1

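The function you pass to transform is invoked on the driver, once per batch, which is why it can take an RDD as its input parameter. Note that the sort itself still runs in parallel, across the partitions of that RDD. A DStream is a sequence of RDDs, one per interval, and transform maps each input RDD to exactly one output RDD. So every batch of mDstream contains a single RDD, and because the window slides every 3 seconds, a new one is generated every 3 seconds.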
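To see this for yourself, here is a minimal sketch that counts how many RDDs the transformed stream delivers for each batch time. It is not the code from the question: it assumes a local Spark installation, swaps the socket source for queueStream so that it runs without a server, and uses a plain reduceByKey instead of the windowed version. The names TransformPerBatch, rddsPerBatch and seen are made up for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
import scala.collection.mutable

object TransformPerBatch {
  // Returns, for each batch time, how many RDDs the transformed DStream delivered.
  def rddsPerBatch(): Map[Time, Int] = {
    val conf = new SparkConf().setAppName("transform-per-batch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Assumption: queueStream replaces the socket source; it feeds one queued RDD per batch.
    val queue = mutable.Queue[RDD[String]](
      ssc.sparkContext.parallelize(Seq("a b a", "c a")),
      ssc.sparkContext.parallelize(Seq("b b c"))
    )

    val sorted = ssc.queueStream(queue)
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      // The function below is called on the driver once per batch;
      // the sort it describes is executed in parallel on the partitions.
      .transform(rdd => rdd.sortBy(_._2, ascending = false))

    val seen = mutable.Map.empty[Time, Int].withDefaultValue(0)
    sorted.foreachRDD { (rdd: RDD[(String, Int)], time: Time) =>
      // This body also runs on the driver, once for each generated RDD.
      seen.synchronized { seen(time) += 1 }
      println(s"$time: partitions=${rdd.getNumPartitions}, top=${rdd.take(3).mkString(", ")}")
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000) // let a few batches run
    ssc.stop(stopSparkContext = true, stopGracefully = false)
    seen.synchronized { seen.toMap }
  }

  def main(args: Array[String]): Unit =
    rddsPerBatch().toSeq.sortBy(_._1.milliseconds).foreach {
      case (time, n) => println(s"$time -> $n RDD(s)")
    }
}
```

Each batch time should map to a count of 1, while the number of partitions inside that RDD can be larger than 1. The partitions are what gets processed in parallel, not multiple RDDs.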

solution1
1 2017-06-20 09:48:45
