
Trigger Azure Databricks when a blob changes

I am parsing files from Azure Blob Storage using Spark in Azure Databricks. The blob container is mounted to DBFS. Right now I do this in a notebook with a hardcoded DBFS file name, but I want to trigger the notebook with the new DBFS file name whenever a new blob is created. I checked that Azure Functions gives me a blob trigger. Can I start a Databricks notebook or job from an Azure Function? The operations on the blob take quite some time. Is it advisable to use Azure Functions in such cases, or is there some other way to achieve this?

As Parth Deb says, using Azure Data Factory will be easier for your requirement.

You just need to create a pipeline with a Databricks Notebook activity and attach an event trigger based on 'blob created' to it. The only other thing you need to do is pass parameters (for example the file name) to the notebook.
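For reference, this is roughly what such a trigger looks like when written as JSON instead of clicked together in the portal. It is only a sketch: the helper name, the pipeline name, the container, and the storage account resource ID are all placeholders I made up, and the pipeline is assumed to declare two parameters called `folderPath` and `fileName`.

```python
import json


def build_blob_created_trigger(pipeline_name, storage_account_id, container, suffix=""):
    """Build an ADF storage event trigger definition as a plain dict.

    Hypothetical helper for illustration; all argument values are placeholders.
    """
    type_properties = {
        # Only react to blobs inside this container.
        "blobPathBeginsWith": f"/{container}/blobs/",
        "ignoreEmptyBlobs": True,
        # Full resource ID of the storage account that emits the events.
        "scope": storage_account_id,
        "events": ["Microsoft.Storage.BlobCreated"],
    }
    if suffix:
        # Optional filter on the file name, e.g. ".csv".
        type_properties["blobPathEndsWith"] = suffix

    return {
        "name": f"{pipeline_name}-on-blob-created",
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": type_properties,
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    },
                    # Map the event payload onto the pipeline parameters
                    # (assumed to be named folderPath and fileName).
                    "parameters": {
                        "folderPath": "@triggerBody().folderPath",
                        "fileName": "@triggerBody().fileName",
                    },
                }
            ],
        },
    }


# Example with placeholder values.
example = build_blob_created_trigger(
    pipeline_name="ParseBlobPipeline",
    storage_account_id="<storage-account-resource-id>",
    container="input",
    suffix=".csv",
)
print(json.dumps(example, indent=2))
```

Inside the pipeline, the Notebook activity can then forward `@pipeline().parameters.fileName` to the notebook through its base parameters.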

This is built-in functionality of Data Factory; you can check the documentation:

https://docs.microsoft.com/en-us/azure/data-factory/concepts-pipelines-activities

https://docs.microsoft.com/en-us/azure/data-factory/transform-data-databricks-notebook

https://docs.microsoft.com/en-us/azure/data-factory/how-to-expression-language-functions

You can look at the documents above. In the end, you basically only need a few mouse clicks.

I ended up using ADF. I created a new pipeline with blob event triggers that fire based on the file names.
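On the notebook side, the hardcoded DBFS file name can then be replaced by a path built from the trigger values. The following is a minimal sketch, not the exact code from my notebook: `to_dbfs_path` and the mount point `/mnt/input` are hypothetical, and it assumes that the folder path delivered by the trigger starts with the container name and that this container is what is mounted.

```python
import posixpath


def to_dbfs_path(folder_path, file_name, mount_point="/mnt/input"):
    """Translate the trigger's folder path and file name into a DBFS path.

    Assumes the first segment of folder_path is the container name and
    that the container is mounted at mount_point (both are assumptions).
    """
    parts = folder_path.strip("/").split("/")
    # Drop the container name; keep any sub-folders below it.
    sub_folders = parts[1:]
    return posixpath.join(mount_point, *sub_folders, file_name)


# In a Databricks notebook the two values would typically be read from
# widgets filled by the ADF Notebook activity's base parameters, e.g.:
#   folder_path = dbutils.widgets.get("folderPath")
#   file_name = dbutils.widgets.get("fileName")
# Placeholder values are used here so the sketch runs anywhere.
print(to_dbfs_path("mycontainer/incoming/2020", "data.csv"))
```

The resulting path can be passed to `spark.read` in place of the hardcoded name.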

