简体   繁体   中英

Access Azure Blob storage account from azure data factory

I have a folder with list of files in my storage account and having been trying to delete one of the files using pipeline. In-order to get that done I have used "Web" in pipeline, copied the blob storage url and access keys.

Tired using the access keys directly under Headers|Authorization. Also tried the concept of Shared Keys at https://docs.microsoft.com/en-us/azure/storage/common/storage-rest-api-auth#creating-the-authorization-header

Even tried getting this work with curl, but it returned an Authentication Error every time I tried to run

# List the blobs in an Azure storage container.

echo "usage: ${0##*/} <storage-account-name> <container-name> <access-key>"

storage_account="$1"
container_name="$2"
access_key="$3"

blob_store_url="blob.core.windows.net"
authorization="SharedKey"

request_method="DELETE"
request_date=$(TZ=GMT LC_ALL=en_US.utf8 date "+%a, %d %h %Y %H:%M:%S %Z")
#request_date="Mon, 18 Apr 2016 05:16:09 GMT"
storage_service_version="2018-03-28"

# HTTP Request headers
x_ms_date_h="x-ms-date:$request_date"
x_ms_version_h="x-ms-version:$storage_service_version"

# Build the signature string
canonicalized_headers="${x_ms_date_h}\n${x_ms_version_h}"
canonicalized_resource="/${storage_account}/${container_name}"

string_to_sign="${request_method}\n\n\n\n\n\n\n\n\n\n\n\n${canonicalized_headers}\n${canonicalized_resource}\ncomp:list\nrestype:container"


# Decode the Base64 encoded access key, convert to Hex.
decoded_hex_key="$(echo -n $access_key | base64 -d -w0 | xxd -p -c256)"


# Create the HMAC signature for the Authorization header
signature=$(printf "$string_to_sign" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$decoded_hex_key" -binary |  base64 -w0)

authorization_header="Authorization: $authorization $storage_account:$signature"

curl \
  -H "$x_ms_date_h" \
  -H "$x_ms_version_h" \
  -H "$authorization_header" \
  -H "Content-Length: 0"\
  -X DELETE  "https://${storage_account}.${blob_store_url}/${container_name}/myfile.csv_123"

The curl command returns an error:

<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. RequestId:XX Time:2018-08-09T10:09:41.3394688Z</Message><AuthenticationErrorDetail>The MAC signature found in the HTTP request 'xxx' is not the same as any computed signature. Server used following string to sign: 'DELETE

You cannot authorize directly from the Data Factory to the storage account API. I suggest that you use an Logic App. The Logic App has built in support for Blob store: https://docs.microsoft.com/en-us/azure/connectors/connectors-create-api-azureblobstorage

You can call the Logic App from the Data Factory Web Activity. Using the body of the Data Factory request you can pass variables to the Logic app like the blob path.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Rest;
using Microsoft.Azure.Management.ResourceManager;
using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;
using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Microsoft.WindowsAzure.Storage;

namespace ClearLanding
{
    class Program
    {
        static void Main(string[] args)
        {
            CloudStorageAccount backupStorageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=yyy;AccountKey=xxx;EndpointSuffix=core.windows.net");
            var backupBlobClient = backupStorageAccount.CreateCloudBlobClient();
            var backupContainer = backupBlobClient.GetContainerReference("landing");
            var tgtBlobClient = backupStorageAccount.CreateCloudBlobClient();
            var tgtContainer = tgtBlobClient.GetContainerReference("backup");
            string[] folderNames = args[0].Split(new char[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string folderName in folderNames)
            {
                var list = backupContainer.ListBlobs(prefix: folderName + "/", useFlatBlobListing: false);
                foreach (Microsoft.WindowsAzure.Storage.Blob.IListBlobItem item in list)
                {
                    if (item.GetType() == typeof(Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob))
                    {
                        Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob blob = (Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob)item;
                        if (!blob.Name.ToUpper().Contains("DO_NOT_DEL"))
                        {
                            var tgtBlob = tgtContainer.GetBlockBlobReference(blob.Name + "_" + DateTime.Now.ToString("yyyyMMddHHmmss"));
                            tgtBlob.StartCopy(blob);
                            blob.Delete();

                        }
                    }
                }
            }
        }
    }
}

I tried resolving this by compiling the above code and referencing it using custom activity in C# pipeline. The code snippet above transfers files from landing folder to a backup folder and deletes the file from landing

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM