How to automate long-running code and save data in Azure Databricks?
I am using the %run feature in Azure Databricks to execute many notebooks in sequence from a command notebook. One notebook runs a long computation on a dataset (~5 hrs), and I want to save its output. I tried including the save step at the end of the long-running notebook, but the save times out (see error below). I only see this error when the long-running notebook takes 2+ hours to run. Is there any way I can automate this?
I'm able to pass the data I want back through the %run feature in the command notebook and save it there, but I have to run the save manually after the long-running notebook finishes; otherwise I get the same authentication timeout error. I'd like to have one notebook where I only need to click "run all".
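One workaround sometimes suggested for this pattern is to replace %run with dbutils.notebook.run, which executes each child notebook in its own execution context with an explicit timeout, so a single "run all" in a driver notebook covers the whole pipeline. A minimal sketch, assuming a hypothetical two-step pipeline (the notebook paths are made up, and the runner is injected so the driver logic can also be exercised outside Databricks):

```python
# Hypothetical sketch: chaining notebooks with dbutils.notebook.run instead of %run.
# Each child notebook runs in its own context with its own timeout; the paths and
# timeouts below are placeholders, not the asker's actual notebooks.

PIPELINE = [
    ("/pipeline/long_compute", 6 * 60 * 60),  # ~5 hr computation, 6 hr timeout
    ("/pipeline/save_output", 30 * 60),       # save step, 30 min timeout
]

def run_pipeline(run_notebook, pipeline=PIPELINE):
    """Run each (path, timeout_seconds) step in order.

    run_notebook mimics dbutils.notebook.run(path, timeout_seconds).
    """
    results = []
    for path, timeout_s in pipeline:
        results.append(run_notebook(path, timeout_s))
    return results

# On Databricks the runner would be:
# run_pipeline(lambda path, t: dbutils.notebook.run(path, t))
```

Whether this avoids the authentication timeout depends on how the save step acquires credentials, so treat it as something to test, not a guaranteed fix.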
I find it is better to break long notebooks up into smaller ones and use the multi-task job scheduler (Databricks Jobs) to run them in order.