
In Databricks, how to automate notebook runs

I have multiple datasets in Databricks that are updated on inconsistent schedules: database.A, database.B, and database.C.

  • database.A: updated on the first of every month (e.g. 1/1/2022, 2/1/2022, etc.), but sometimes has mid-session updates (e.g. 3/14/2022, 4/12/2022, etc.)
  • database.B: updated on the fifth of every month
  • database.C: updated on the first of every quarter (e.g. 1/1/2022, 4/1/2022, etc.), but sometimes has a mid-session update (e.g. 5/1/2022, etc.)

My goal is to create a notebook that runs processes when the data is updated in any of these datasets. For example:

data.updated.A <- some_code_or_function(database.A)
data.updated.B <- some_code_or_function(database.B)
data.updated.C <- some_code_or_function(database.C)

case when data.updated.A = TRUE or data.updated.B = TRUE or data.updated.C = TRUE then run_notebook else do_nothing_and_send_signal_1_day_from_now

Any ideas? Full disclosure: I am relatively new to Databricks, so I don't know whether I need to switch from SQL to Scala, Python, or R, and I am fully willing to. Should I consider another tactic besides scheduled processes? Thanks.

You can run the notebook as a job and schedule it with cron: https://docs.databricks.com/jobs.html#create-a-job
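One way to combine a cron schedule with the "run only if something changed" logic from the question is to schedule a daily job and let the notebook decide whether there is work to do. The sketch below is only an illustration under assumptions: the function names, the cron constant, and the idea of storing a "last processed" timestamp are hypothetical, and it assumes you can obtain the latest update time of each table yourself (for Delta tables, the `timestamp` column of `DESCRIBE HISTORY` is one possible source).

```python
from datetime import datetime, timezone

# Quartz-style cron expression (seconds, minutes, hours, day-of-month,
# month, day-of-week): run every day at 06:00. The time is an arbitrary example.
DAILY_CHECK_CRON = "0 0 6 * * ?"


def updated_tables(latest_update, last_processed):
    """Return the sorted names of tables updated after the last processed run.

    latest_update:  dict mapping table name -> datetime of its latest update
                    (None if unknown). How these are fetched is an assumption.
    last_processed: datetime of the last successful notebook run.
    """
    return sorted(
        name
        for name, updated_at in latest_update.items()
        if updated_at is not None and updated_at > last_processed
    )


def should_run(latest_update, last_processed):
    """True if at least one table changed since the last processed run."""
    return bool(updated_tables(latest_update, last_processed))


if __name__ == "__main__":
    # Hypothetical timestamps based on the dates mentioned in the question.
    last_processed = datetime(2022, 3, 5, tzinfo=timezone.utc)
    latest_update = {
        "database.A": datetime(2022, 3, 14, tzinfo=timezone.utc),
        "database.B": datetime(2022, 3, 5, tzinfo=timezone.utc),
        "database.C": datetime(2022, 1, 1, tzinfo=timezone.utc),
    }
    if should_run(latest_update, last_processed):
        print("Run processing for:", updated_tables(latest_update, last_processed))
    else:
        print("Nothing changed; the next scheduled run will check again.")
```

With this approach the "do nothing and check again in one day" branch of the pseudocode is handled by the schedule itself: when `should_run` returns False, the notebook can simply stop early (for example with `dbutils.notebook.exit`), and the next daily run repeats the check.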

If you are deploying your notebooks using Terraform, you can use this module that I wrote: https://github.com/tomarv2/terraform-databricks-workspace-management
