
Data from HTTP endpoint to be loaded into Azure Data Lake using Azure Data Factory

I am trying to build a so-called "modern data warehouse" using Azure services.

The first step is to gather all the data, in its native raw format, into Azure Data Lake Store. For some of the data sources we have no choice but to consume the data through an API. There is not much information to be found on this, hence my question.

Is it possible to define two Web activities in my pipeline that handle the scenario below?

  1. The Web1 activity calls an API URL generated by C# (an Azure Function). It returns data in JSON format and saves it to Web1.Output; this is working fine.
  2. The Web2 activity consumes Web1.Output and saves it into Azure Data Lake as a plain txt file (PUT or POST); this is what I need.
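The two chained Web activities might look something like the following pipeline sketch. All names and URLs here are illustrative assumptions, not values from the question; note also that a single PUT with `x-ms-blob-type` works against the Blob endpoint, whereas the ADLS Gen2 endpoint requires a create/append/flush sequence:

```json
{
  "name": "IngestFromApi",
  "properties": {
    "activities": [
      {
        "name": "Web1",
        "type": "WebActivity",
        "typeProperties": {
          "url": "https://myfunctions.azurewebsites.net/api/GetApiUrl",
          "method": "GET"
        }
      },
      {
        "name": "Web2",
        "type": "WebActivity",
        "dependsOn": [
          { "activity": "Web1", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "url": "https://mystorage.blob.core.windows.net/raw/output.txt",
          "method": "PUT",
          "headers": { "x-ms-blob-type": "BlockBlob" },
          "body": "@activity('Web1').output",
          "authentication": { "type": "MSI", "resource": "https://storage.azure.com/" }
        }
      }
    ]
  }
}
```

The `dependsOn` block is what chains Web2 after Web1 and makes `@activity('Web1').output` available as an expression.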

The above scenario is achievable with a Copy activity, but then I am unable to pass the dynamic URL generated by the Azure Function. How do I save the JSON output to ADL? Is there any other way?
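One possible workaround for the dynamic-URL limitation is to parameterize the HTTP dataset that the Copy activity reads from, and feed the Web1 output into that parameter. This is a sketch under assumptions: the dataset and linked-service names are invented, and the exact shape of `@activity('Web1').output` depends on what the Azure Function actually returns:

```json
{
  "name": "HttpSourceDataset",
  "properties": {
    "type": "HttpFile",
    "linkedServiceName": {
      "referenceName": "HttpLinkedService",
      "type": "LinkedServiceReference"
    },
    "parameters": { "relativeUrl": { "type": "String" } },
    "typeProperties": {
      "relativeUrl": "@dataset().relativeUrl",
      "requestMethod": "GET"
    }
  }
}
```

The Copy activity's input dataset reference would then pass something like `"relativeUrl": "@activity('Web1').output.url"` (the `url` field is an assumed property of the function's JSON response), with the sink pointing at an Azure Data Lake Store dataset.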

Thanks!

Since you are using blob storage as an intermediary, and want to consume the blob upon creation, you could take advantage of Event Triggers. You can set up an event trigger to run a pipeline containing the Web2 activity, which kicks off when the Web1 activity (in a separate pipeline) completes.
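A minimal event trigger definition along those lines might look like this. The paths, pipeline name, and the `<...>` placeholders are assumptions to be filled in; `@triggerBody().fileName` and `@triggerBody().folderPath` are the values the blob event exposes to the triggered pipeline:

```json
{
  "name": "BlobCreatedTrigger",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/raw/blobs/",
      "blobPathEndsWith": ".json",
      "events": [ "Microsoft.Storage.BlobCreated" ],
      "scope": "/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Storage/storageAccounts/<account>"
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "Web2Pipeline",
          "type": "PipelineReference"
        },
        "parameters": { "fileName": "@triggerBody().fileName" }
      }
    ]
  }
}
```

Blob event triggers require the storage account to be registered with Event Grid, which Data Factory handles when the trigger is activated.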

By separating the two activities into separate pipelines, the workflow becomes asynchronous. This means you will not need to wait for both activities to complete before processing the next URL. There are many other benefits as well.

