
Python Azure function processing blob storage

I am trying to build a pipeline with Data Factory in MS Azure that reads data from blob storage, runs a Python processing algorithm on it, and then sends the result to another destination.

How can I do the same with Azure Function Apps? Or is there a better way to do it?

Thanks in advance.

Shyam

Yes, you can do this. I recently built a Data Factory (ADF) pipeline that pulls data from blob storage and transfers it to Snowflake, which makes a good concrete example. Snowflake has a number of connectors (including one for Python) that let you connect to it and run queries, which is how you create a stage in order to pull data from Azure. Here is the Snowflake documentation: https://docs.snowflake.net/manuals/user-guide/data-load-azure-create-stage.html .

You can follow the documentation here for creating an Azure Function in Python: https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-first-function-python and then replace the sample code with whatever code you need to move your data elsewhere. Note that this currently cannot be done in the portal (though Microsoft mentions that it is something they hope to fix soon). The same approach works for any other endpoint you want to move your data to; Snowflake is just an example.
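As a rough illustration of the "replace the sample code" step, here is a minimal sketch of a function body that reads a blob, processes it, and writes the result to an output binding. Everything specific in it is an assumption and not part of my actual pipeline: the names `inblob` and `outblob` are hypothetical and would have to match blob trigger and blob output bindings you declare in function.json, and `process` is only a placeholder (it drops empty CSV rows) standing in for your own algorithm. In a real Function App you would normally annotate the parameters with `func.InputStream` and `func.Out[bytes]` from `azure.functions`; the sketch relies only on their `read()` and `set()` methods so it can also be run locally.

```python
import csv
import io


def process(data: bytes) -> bytes:
    """Placeholder processing step: drop CSV rows whose fields are all empty."""
    reader = csv.reader(io.StringIO(data.decode("utf-8")))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Keep the row only if at least one field has non-blank content.
        if any(field.strip() for field in row):
            writer.writerow(row)
    return out.getvalue().encode("utf-8")


def main(inblob, outblob):
    # inblob: assumed blob-trigger input exposing .read() -> bytes
    # outblob: assumed blob output binding exposing .set(value)
    outblob.set(process(inblob.read()))
```

Keeping the processing in its own function means you can test it on plain bytes without deploying anything, and swap the output binding for another destination later.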

In my case, I used an ADF copy activity to pull data from a local file server into blob storage. From there, I created an Azure Function (Python) that connected to Snowflake and ran SnowSQL queries to create a file format, create an Azure stage, and copy from the stage into a table (already created). Of course, for Snowflake you can just run all of those queries in a worksheet from the portal, but if you want all of your code stored in ADF (and you use Snowflake), then this is a nice way to do it:

  1. Imports:
import logging
import snowflake.connector 
import azure.functions as func
...
  2. Set up the Snowflake connection and execute queries (insert this code in the "main" function):
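# Replace the placeholder credentials with your own values.
con = snowflake.connector.connect(
    user='user',
    password='password',
    account='account'
)
cs = con.cursor()
try:
    cs.execute("USE WAREHOUSE ...")
    cs.execute("USE DATABASE ...")
    ...
finally:
    # Close the cursor and the connection even if a query fails.
    cs.close()
    con.close()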
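The three queries mentioned above (create a file format, create an Azure stage, copy from the stage into a table) could be generated and run roughly as follows. This is only a sketch under stated assumptions, not the exact queries I used: the helper names `build_load_statements` and `run_statements`, the CSV options, and all object names in the usage note are hypothetical, and the stage is assumed to authenticate with a SAS token. The values are interpolated directly into the SQL, so they should come from trusted configuration only.

```python
def build_load_statements(file_format, stage, container_url, sas_token, table):
    """Return the Snowflake statements for a blob-to-table load, in order."""
    return [
        # 1. File format describing the files in blob storage (CSV assumed).
        f"CREATE OR REPLACE FILE FORMAT {file_format} "
        f"TYPE = 'CSV' SKIP_HEADER = 1",
        # 2. External stage pointing at the Azure container.
        f"CREATE OR REPLACE STAGE {stage} URL = '{container_url}' "
        f"CREDENTIALS = (AZURE_SAS_TOKEN = '{sas_token}') "
        f"FILE_FORMAT = (FORMAT_NAME = '{file_format}')",
        # 3. Load the staged files into the existing table.
        f"COPY INTO {table} FROM @{stage}",
    ]


def run_statements(cursor, statements):
    """Execute each statement on an open cursor, stopping at the first error."""
    for statement in statements:
        cursor.execute(statement)
```

With the cursor from step 2, usage would look something like `run_statements(cs, build_load_statements('my_csv_format', 'my_azure_stage', 'azure://myaccount.blob.core.windows.net/mycontainer', '<sas-token>', 'my_table'))`, placed inside the `try` block.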

I created a Flask API and called my Python code through it. I then deployed it to Azure as a web app and accessed the blob from there.
