Using Azure Functions to call a REST API and save the results in Azure Data Lake Gen2

I want to call a REST API and save the results as a CSV or JSON file in Azure Data Lake Gen2. Based on what I have read, Azure Functions is the way to go.

The web service returns data in the following format:

"ID","ProductName","Company"
"1","Apples","Alfreds futterkiste"
"2","Oranges","Alfreds futterkiste"
"3","Bananas","Alfreds futterkiste"
"4","Salad","Alfreds futterkiste"
 ...next rows

I have written a console app in C# that currently writes the data to the console. The web service uses pagination and returns 1000 rows per request (determined by the &num parameter, with a maximum of 1000). After the first request I can use the &next parameter to fetch the next 1000 rows based on ID. For instance, the URL

http://testWebservice123.com/Example.csv?auth=abc&number=1000&next=1000

will get me the rows with IDs 1001 to 2000. (In reality the API call and the pagination are a bit more complex, so I cannot use, for instance, Azure Data Factory v2 to load the data into Azure Data Lake. This is why I think I need Azure Functions, unless I have overlooked another service. So the following is just a demo to learn how to write to Azure Data Lake.)

I have the following C#:

using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string startUrl = "http://testWebservice123.com/Example.csv?auth=abc&number=1000";
        string deltaRequestParameter = "";
        int numberOfLines;

        do
        {
            string url = startUrl + deltaRequestParameter;
            numberOfLines = 0;

            // Dispose the client, the response stream and the reader after each page.
            using (WebClient myWebClient = new WebClient())
            using (Stream myStream = myWebClient.OpenRead(url))
            using (StreamReader sr = new StreamReader(myStream))
            {
                while (!sr.EndOfStream)
                {
                    var row = sr.ReadLine();
                    // Naive split: breaks if a quoted value itself contains a comma.
                    var values = row.Split(',');

                    // Do whatever with the rows for now, i.e. write to console.
                    Console.WriteLine(values[0] + " " + values[1]);

                    // Remember the ID of the current row; after the loop this is the last ID of the page.
                    string lastLine = values[0].Replace("\"", "");
                    numberOfLines++;
                    deltaRequestParameter = "&next=" + lastLine;
                }
            }
        } while (numberOfLines == 1001); // the header is returned each time, so a full page has 1001 lines until the last request
    }
}

I want to write the data to a CSV file in the data lake in the most efficient way. How would I rewrite the above code to work in an Azure Function and save to a CSV file in Azure Data Lake Gen2?

Here are the steps to achieve this:

1) Create an Azure Function and choose a trigger: HttpTrigger, TimerTrigger, or whatever fits your need.

2) I am assuming you already have the code that calls the API in a loop until it has returned all the data you want.

3) Once you have the data in memory, write it to Azure Data Lake with the code below.
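Step 2 can be sketched roughly as follows. This is only a minimal illustration of the paging loop from the question, restructured so that the rows end up in memory with the header kept once. The class name CsvPager, the method CollectAllPages and the fetchPage delegate are hypothetical names introduced here; fetchPage stands in for the real HTTP call and is assumed to return the lines of one page, header first.

```csharp
using System;
using System.Collections.Generic;

public static class CsvPager
{
    // Collects every page into one list of CSV lines, keeping the header only once.
    // fetchPage is a hypothetical stand-in for the HTTP request: it takes a URL
    // and returns the lines of that page, with the header as the first line.
    public static List<string> CollectAllPages(
        string startUrl, int pageSize, Func<string, IReadOnlyList<string>> fetchPage)
    {
        var allLines = new List<string>();
        string url = startUrl;

        while (true)
        {
            IReadOnlyList<string> page = fetchPage(url);
            if (page.Count == 0) break;                      // nothing returned at all

            if (allLines.Count == 0) allLines.Add(page[0]);  // header from the first page only

            for (int i = 1; i < page.Count; i++) allLines.Add(page[i]);

            int dataRows = page.Count - 1;
            if (dataRows < pageSize) break;                  // short page: this was the last one

            // The first column of the last data row is the ID to continue from.
            string lastId = page[page.Count - 1].Split(',')[0].Trim('"');
            url = startUrl + "&next=" + lastId;
        }

        return allLines;
    }
}
```

Passing the fetch step in as a delegate keeps the loop independent of WebClient or HttpClient, so it can be exercised with canned pages before it is wired into a function.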

Prerequisites for accessing ADLS from your C# code:

1) Register an app in Azure AD

2) Grant the app permission on the Data Lake store

Below is the code for creating the ADLS client.

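// Needs the Microsoft.Azure.DataLake.Store and Microsoft.Rest.ClientRuntime.Azure.Authentication NuGet packages.
using System;
using System.Threading;
using Microsoft.Azure.DataLake.Store;
using Microsoft.Rest;
using Microsoft.Rest.Azure.Authentication;

// ADLS connection. tenantId, serviceAppIDADLS, servicePrincipalSecretADLS and adlsName
// come from your own configuration; ADL_TOKEN_AUDIENCE is the token audience Uri for Data Lake.
var adlCreds = GetCreds_SPI_SecretKey(tenantId, ADL_TOKEN_AUDIENCE, serviceAppIDADLS, servicePrincipalSecretADLS);
var adlsClient = AdlsClient.CreateClient(adlsName, adlCreds);

// Logs in silently as the service principal (app ID + secret) and returns its credentials.
private static ServiceClientCredentials GetCreds_SPI_SecretKey(string tenant, Uri tokenAudience, string clientId, string secretKey)
{
    SynchronizationContext.SetSynchronizationContext(new SynchronizationContext());
    var serviceSettings = ActiveDirectoryServiceSettings.Azure;
    serviceSettings.TokenAudience = tokenAudience;
    // Blocks until the token has been acquired.
    var creds = ApplicationTokenProvider.LoginSilentAsync(tenant, clientId, secretKey, serviceSettings).GetAwaiter().GetResult();
    return creds;
}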

Finally, write the implementation that saves the file in Azure Data Lake:

// These members belong in a static class, because ProcessUserProfile is an extension method.
// Helper.GetHeader, Entities.FBEnitities, DataLakeFileHandler and DataLakeUpdateHandler are
// helpers from the answerer's own project and are not shown here.
const string delim = ",";
static string adlsInputPath = ConfigurationManager.AppSettings.Get("AdlsInputPath");

public static void ProcessUserProfile(this SampleProfile socialProfile, AdlsClient adlsClient, string fileNameExtension = "")
{
    using (MemoryStream memStreamProfile = new MemoryStream())
    {
        using (TextWriter textWriter = new StreamWriter(memStreamProfile))
        {
            string profile;
            string header = Helper.GetHeader(delim, Entities.FBEnitities.Profile);
            string fileName = adlsInputPath + fileNameExtension + "/profile.csv";
            adlsClient.DataLakeFileHandler(textWriter, header, fileName);

            // Build one delimited row from the profile fields.
            profile = socialProfile.UserID
                            + delim + socialProfile.Profile.First_Name
                            + delim + socialProfile.Profile.Last_Name
                            + delim + socialProfile.Profile.Name
                            + delim + socialProfile.Profile.Age_Range_Min
                            + delim + socialProfile.Profile.Age_Range_Max
                            + delim + socialProfile.Profile.Birthday
                           ;

            textWriter.WriteLine(profile);
            // Flush the writer so the row reaches the memory stream before it is uploaded.
            textWriter.Flush();
            memStreamProfile.Flush();
            adlsClient.DataLakeUpdateHandler(fileName, memStreamProfile);
        }
    }
}
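Since the helper methods above are not shown, here is a minimal sketch of just the "write CSV lines to a stream" part, using only the .NET base class library. CsvStreamWriter and WriteLines are hypothetical names introduced for illustration. The target can be any writable Stream, for example a MemoryStream as in the code above. As far as I know, the Microsoft.Azure.DataLake.Store package also exposes a writable stream through AdlsClient.CreateFile, but treat that as an assumption and check it against the package documentation.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvStreamWriter
{
    // Writes each line to the target stream as UTF-8 text (no byte order mark).
    public static void WriteLines(Stream target, IEnumerable<string> lines)
    {
        // leaveOpen: true so the caller keeps ownership of the target stream.
        using (var writer = new StreamWriter(target, new UTF8Encoding(false), 4096, leaveOpen: true))
        {
            writer.NewLine = "\n";
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        } // disposing the writer flushes any buffered text into the target
    }
}
```

Writing row by row like this avoids building one large string for the whole file. When the target is a MemoryStream that is uploaded afterwards, remember to reset its Position to 0 before reading from it.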

Hope it helps.
