Azure - Trigger Databricks notebook for each new blob in Storage container

I am implementing a test solution as follows:

I have created an Azure Databricks notebook in Python. This notebook performs the following tasks (for testing):

  1. Read a blob file from the Storage account into a PySpark DataFrame.
  2. Do some transformation and analysis on it.
  3. Create a CSV with the transformed data and store it in a different container.
  4. Move the original CSV to a separate archive container (so that it is not picked up in the next execution).

*The steps above could also be split across separate notebooks.
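
The four steps above can be sketched roughly as follows. This is a minimal sketch, not the actual notebook: the storage account name, the container names, and the `dropDuplicates` transformation are placeholders, and it assumes that `spark` and `dbutils` are the objects Databricks provides in a notebook and that the cluster already has access to the storage account.

```python
# Hypothetical names: replace with your own storage account and containers.
ACCOUNT = "mystorageaccount"
INPUT_CONTAINER = "input"
OUTPUT_CONTAINER = "output"
ARCHIVE_CONTAINER = "archive"


def blob_url(container, path, account=ACCOUNT):
    """Build a wasbs:// URL for a blob path inside a container."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def process_blob(spark, dbutils, filename):
    """Read one CSV blob, transform it, write the result, archive the source."""
    source = blob_url(INPUT_CONTAINER, filename)

    # 1. Read the blob into a PySpark DataFrame.
    df = spark.read.csv(source, header=True, inferSchema=True)

    # 2. Placeholder transformation; the real analysis goes here.
    transformed = df.dropDuplicates()

    # 3. Write the result as CSV to a different container.
    #    Spark writes a directory of part files, not a single CSV file.
    target = blob_url(OUTPUT_CONTAINER, f"transformed/{filename}")
    transformed.write.mode("overwrite").csv(target, header=True)

    # 4. Move the original file to the archive container so that the
    #    next run does not pick it up again.
    archived = blob_url(ARCHIVE_CONTAINER, filename)
    dbutils.fs.mv(source, archived)
    return target, archived
```

In a notebook cell this would be called as `process_blob(spark, dbutils, filename)`, with `filename` coming from the caller.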

Now I need this notebook to be triggered for each new blob in a container. I plan to implement the following orchestration:

New blob in container -> event to Event Grid topic -> trigger Data Factory pipeline -> execute Databricks notebook.

We can pass the file name as a parameter from the ADF pipeline to the Databricks notebook.
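
On the notebook side, values passed from the ADF Databricks Notebook activity (its base parameters) can be read through widgets. A minimal sketch, assuming the parameter is named `filename` (a hypothetical name; use whatever the pipeline sends) and that `dbutils` is the object Databricks provides in a notebook:

```python
def get_filename(dbutils, name="filename"):
    """Return the file name passed to the notebook as a parameter."""
    # Declare the widget with an empty default so the notebook also runs
    # interactively; a value passed by the caller overrides the default.
    dbutils.widgets.text(name, "")
    value = dbutils.widgets.get(name)
    if not value:
        raise ValueError(f"Notebook parameter '{name}' was not provided")
    return value
```

In the notebook this would be called as `filename = get_filename(dbutils)`.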

I am looking for other ways to do the orchestration flow. If the above seems correct and more suitable, please mark it as answered.

You can use this method. You can also follow this path:

New blob in container -> use the built-in event trigger to trigger the Data Factory pipeline -> execute Databricks notebook.

I don't think you need to introduce Event Grid yourself, because Data Factory has a built-in trigger that fires on blob events such as blob creation.
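
For reference, a storage event trigger of this kind is defined in Data Factory as JSON. The sketch below builds such a definition as a Python dictionary. The trigger name, pipeline name, container, pipeline parameter name, and the storage account resource ID are all placeholders, so treat it as an illustration of the shape rather than a ready-to-deploy definition.

```python
import json


def blob_created_trigger(pipeline_name, storage_account_id, container, suffix=".csv"):
    """Build a Data Factory storage event trigger definition as a dict."""
    return {
        "name": "NewBlobTrigger",  # hypothetical trigger name
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": {
                # Only react to blobs in this container with this suffix.
                "blobPathBeginsWith": f"/{container}/blobs/",
                "blobPathEndsWith": suffix,
                "ignoreEmptyBlobs": True,
                # Resource ID of the storage account being watched.
                "scope": storage_account_id,
                "events": ["Microsoft.Storage.BlobCreated"],
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    },
                    # Hand the blob name to a pipeline parameter
                    # (the parameter name "filename" is an assumption).
                    "parameters": {"filename": "@triggerBody().fileName"},
                }
            ],
        },
    }


if __name__ == "__main__":
    trigger = blob_created_trigger(
        pipeline_name="RunDatabricksNotebook",  # hypothetical pipeline
        storage_account_id=(
            "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
            "/providers/Microsoft.Storage/storageAccounts/<account>"
        ),
        container="input",
    )
    print(json.dumps(trigger, indent=2))
```

The `@triggerBody().fileName` expression is how the trigger hands the blob name to a pipeline parameter, which the pipeline can then forward to the notebook.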

I got two supporting comments for the orchestration I am following: New blob in container -> event to Event Grid topic -> trigger Data Factory pipeline -> execute Databricks notebook.
