Azure architecture best suited to save JSON from API to a Data Lake Store?

I am looking to build an endpoint capable of receiving JSON objects and saving them into ADLS. So far I have tried several different combinations using Functions, Event Hubs, and Stream Analytics. The problem is that no solution so far seems ideal.

TL;DR: In my scenario, I have a small set of users who will send me JSON data through an API, and I need to save it inside ADLS, separated by user. What is the best way of doing so?

Could anyone shed some light on this? Thanks in advance.

WARNING: LONG TEXT AHEAD

Let me explain my findings so far:

Functions

Advantages

  1. single solution approach - solving the scenario with a single service
  2. built-in authorization
  3. organization - saving each user's files to a separate folder inside ADLS
  4. HTTP endpoint - sending data requires only a POST
  5. cheap & pay-as-you-go - charged per request
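
To make the "organization" point concrete, here is a minimal sketch of how a function could derive a per-user target path for each incoming payload. The `/landing` root, the date-partitioned layout, and the `user_blob_path` helper are assumptions for illustration only; nothing in the question prescribes them, and the actual write to ADLS is left out.

```python
import re
from datetime import datetime, timezone


def user_blob_path(user_id: str, received_at: datetime, request_id: str) -> str:
    """Build a per-user, date-partitioned path for one JSON payload.

    The "/landing/<user>/<yyyy>/<mm>/<dd>/" layout is hypothetical.
    """
    # Accept only a conservative character set so that a user id
    # such as "../other" cannot escape its own folder.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", user_id):
        raise ValueError(f"unsupported user id: {user_id!r}")
    # Normalize to UTC so the folder does not depend on the host's time zone.
    day = received_at.astimezone(timezone.utc).strftime("%Y/%m/%d")
    return f"/landing/{user_id}/{day}/{request_id}.json"
```

Passing the request id in from the caller, rather than generating it inside the helper, keeps the function deterministic and easy to test.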

Disadvantages

  1. bindings & dependencies - Functions has no ADLS binding. To authorize and use ADLS, I need to install extra dependencies and manage the credentials manually. I was only able to do it with C#, and haven't tested other languages. That may also be a drawback, although I can't confirm it.
  2. file management - ADLS advises against saving one file per request. The alternative would be to append to files and manage their size. This means more code compared to the other solutions.
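
The "append and manage size" alternative could look roughly like the sketch below: records are grouped per user as newline-delimited JSON and handed to a writer once a size threshold is reached. The `UserBatcher` class, the `flush` callback, and the 4 MB default are all hypothetical; the callback stands in for whatever ADLS append call is actually used. The buffer is held in memory purely for illustration: anything buffered inside a Function instance is lost if that instance is recycled, so a real deployment would need durable state.

```python
import json
from typing import Callable, Dict, List


class UserBatcher:
    """Accumulate JSON records per user and flush them as one chunk."""

    def __init__(self, flush: Callable[[str, bytes], None],
                 max_bytes: int = 4 * 1024 * 1024):
        self._flush = flush          # hypothetical writer: (user_id, data) -> None
        self._max_bytes = max_bytes  # flush threshold per user, in bytes
        self._lines: Dict[str, List[bytes]] = {}
        self._sizes: Dict[str, int] = {}

    def add(self, user_id: str, record: dict) -> None:
        # One compact JSON document per line (newline-delimited JSON).
        line = json.dumps(record, separators=(",", ":")).encode("utf-8") + b"\n"
        self._lines.setdefault(user_id, []).append(line)
        self._sizes[user_id] = self._sizes.get(user_id, 0) + len(line)
        if self._sizes[user_id] >= self._max_bytes:
            self.flush_user(user_id)

    def flush_user(self, user_id: str) -> None:
        lines = self._lines.pop(user_id, [])
        self._sizes.pop(user_id, None)
        if lines:
            self._flush(user_id, b"".join(lines))

    def flush_all(self) -> None:
        # Copy the keys first because flush_user mutates the dict.
        for user_id in list(self._lines):
            self.flush_user(user_id)
```

This is the extra code the second disadvantage refers to: the batching logic itself is small, but deciding when to flush and where to keep the pending data is what the other services would otherwise handle.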

Event Hub

Advantages

  1. no code at all - all I need to do is enable data capture

Disadvantages

  1. one event hub per user - the only way to separate data inside ADLS through Event Hub's capture capability is to use one event hub per user
  2. price - capturing with one event hub per user increases the price drastically
  3. authorization - sending events is not as trivial as doing a POST

Functions + Event Hub

Using Event Hub with Functions mitigates the disadvantages of Functions, but has the same drawbacks as Event Hub alone (except auth).

Functions + Event Hub + Stream Analytics

I could use a single event hub without capture and rely on Stream Analytics SQL as a filter to direct each user's data to its own folder, but the query itself becomes a limiting factor. I have tried it, and it gets slower as the SQL gets bigger.
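
One plausible reading of "the SQL gets bigger" is that the job needs one filtering statement per user, each writing to its own output. The sketch below only illustrates that growth; the `userId` field, the `out-<user>` output aliases, and the `hub-input` alias are assumptions, the generated text merely approximates the shape of a Stream Analytics query, and user ids are assumed to be safe to embed in a string literal.

```python
from typing import Iterable


def build_routing_query(user_ids: Iterable[str],
                        input_alias: str = "hub-input") -> str:
    """Generate one SELECT ... INTO statement per user (illustrative only)."""
    statements = []
    for user_id in user_ids:
        statements.append(
            f"SELECT * INTO [out-{user_id}] "
            f"FROM [{input_alias}] WHERE userId = '{user_id}'"
        )
    return "\n".join(statements)


if __name__ == "__main__":
    # The query grows linearly with the number of users.
    for count in (1, 10, 100):
        query = build_routing_query(f"user{i}" for i in range(count))
        print(count, "users ->", len(query.splitlines()), "statements")
```

Under this assumption, every new user means editing the query and adding another output, which matches the complaint that the approach does not scale with the number of users.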

IoT Hub

IoT Hub has routing, but it is not as dynamic as I require.

Could anyone shed some light on this? Thanks in advance.

I don't quite see the disadvantages of using only Azure Functions to write data to ADLS.

  • As long as you don't end up writing lots of small files, writing 1 file per request should not really be an issue
  • Using the .NET SDK should be pretty straightforward even without an existing binding
  • To solve the authentication piece: use Managed Service Identity (MSI), and use Key Vault to store your client secrets. MSI support in the SDK is apparently on the roadmap and would then make this very easy indeed.
  • You save yourself the extra cost of an Event Hub, and I don't see it adding any real value here
