When will Spark Streaming execute the output operation on a DStream?

I was looking into the Spark Streaming Programming Guide. I have a basic doubt about when it will execute/compute the DStream output operations. For example (taken from one of the examples):

// Create a streaming context with a 1-second batch interval
val ssc = new StreamingContext(conf, Seconds(1))
// Create a DStream that reads text from a TCP socket at localhost:7777
val lines = ssc.socketTextStream("localhost", 7777)
// Output operation: push the records of each batch to an external system
lines.foreachRDD { rdd =>
  rdd.foreachPartition { partitionOfRecords =>
    // One connection per partition, reused for all records in that partition
    val connection = createNewConnection()
    partitionOfRecords.foreach(record => connection.send(record))
    connection.close()
  }
}
// Start the computation
ssc.start()
// Wait for the computation to terminate
ssc.awaitTermination()

Will it do the operation at each batch interval (here, 1 second), or will it wait until termination?

It will read a batch at every 1-second interval and run the entire graph each time. In Spark terminology, this is called executing a job at each interval.
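
For instance, here is a minimal sketch of how to observe this, assuming a local socket source on port 7777 and using the two-argument foreachRDD overload that also receives the batch time. The closure below is invoked once per batch interval, even when the batch is empty.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("BatchIntervalDemo")
val ssc = new StreamingContext(conf, Seconds(1))
val lines = ssc.socketTextStream("localhost", 7777)

// Runs once per 1-second batch interval; `time` is the batch time.
lines.foreachRDD { (rdd, time) =>
  println(s"Job for batch $time, records in batch: ${rdd.count()}")
}

ssc.start()
ssc.awaitTermination()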

The streaming job will only terminate once you signal it to stop.
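
As a sketch of one way to signal it (assuming, for demonstration, that you want to stop after a 60-second timeout), you can use awaitTerminationOrTimeout and then stop the context gracefully:

// Block for at most 60 seconds; returns false if the timeout elapsed first.
val terminated = ssc.awaitTerminationOrTimeout(60000)
if (!terminated) {
  // Finish in-flight batches (graceful) and also stop the underlying SparkContext.
  ssc.stop(stopSparkContext = true, stopGracefully = true)
}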
