
How can I run a Spark structured streaming job for a certain time?

I want to schedule a Spark Structured Streaming job each day. The job itself must run for a certain number of hours and then stop. So how can I specify such a time duration?

You need to schedule the job with the Databricks scheduler once a day, and then in the code add a timeout to your query:

# Build the writer and call .start() to obtain a StreamingQuery
query = (df.writeStream...)

# Block for at most timeoutInSeconds, then stop the query
query.awaitTermination(timeoutInSeconds)
query.stop()
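
For a fuller picture, here is a minimal, self-contained sketch of the same pattern. The rate source, console sink, checkpoint path, and the 4-hour run duration are placeholder assumptions for illustration; substitute your real source, sink, and schedule:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("timed-streaming-job").getOrCreate()

# Placeholder run duration: 4 hours. Note that PySpark's
# awaitTermination() takes seconds (the Scala API takes milliseconds).
run_duration_seconds = 4 * 60 * 60

# Placeholder source: the built-in rate source, which emits rows continuously.
df = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

# Placeholder sink and checkpoint path; start() returns a StreamingQuery.
query = (df.writeStream
           .format("console")
           .option("checkpointLocation", "/tmp/checkpoints/timed-job")
           .start())

# Block until the query terminates on its own or the timeout elapses.
# Returns True if the query terminated, False if the timeout was hit.
finished = query.awaitTermination(run_duration_seconds)
if not finished:
    query.stop()  # graceful stop after the allotted run time

spark.stop()

With this structure the job process exits cleanly after the allotted hours, so the daily schedule simply restarts it the next day; because a checkpoint location is set, the stream resumes from where it left off.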

