
How to handle or architect incremental data ingestion in Azure Data Lake Store?

I have two custom-code DLLs that work with images from IP cameras.

dll-One: Extracts images from the IP cams and stores them in Azure Data Lake Store.

For example:

  • /adls/clinic1/patientimages
  • /adls/clinic2/patientimages

dll-two: Uses those images, extracts information from them, and loads the data into RDBMS tables.

For instance, say the RDBMS contains the entities dimpatient, dimclinic and factpatientVisit.

To start, a one-time export of this data can be written to defined locations in Azure Data Lake Store.

For example:

  • /adls/dimpatient
  • /adls/dimclinic
  • /adls/factpatientVisit

Question: How can I push incremental data into the same file, or how else can this incremental load be handled in Azure Data Analytics?

This is like implementing a warehouse in Azure Data Analytics.

Note: I do not want to use Azure SQL DB or any other storage offered by Azure. Why spend on other Azure services if one type of storage is capable of holding all types of data?

adls is the name of my ADLS storage.

I am not sure I completely understand your question, but you can organize your data files in Azure Data Lake Store, or your rows in partitioned U-SQL tables, along a time dimension, so that you can add a new partition or file for each increment instead of rewriting an existing one. In general, though, we recommend that such increments be of substantial size in order to preserve the ability to scale.
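As a rough illustration of the time-dimension layout, here is a minimal Python sketch that only computes where each increment would land. It does not talk to Azure at all. The year/month/day folder scheme, the `.csv` file naming, and the helper names `increment_path` and `increment_paths` are assumptions made for this example; only the `/adls/<entity>` roots come from the question.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath


def increment_path(entity: str, load_date: date, root: str = "/adls") -> str:
    """Return the file path for one entity's increment on one load date.

    Hypothetical layout: <root>/<entity>/YYYY/MM/DD/<entity>_YYYYMMDD.csv
    """
    folder = PurePosixPath(root) / entity / f"{load_date:%Y/%m/%d}"
    return str(folder / f"{entity}_{load_date:%Y%m%d}.csv")


def increment_paths(entity: str, start: date, end: date, root: str = "/adls") -> list[str]:
    """List the increment paths for every day from start to end, inclusive."""
    days = (end - start).days
    return [increment_path(entity, start + timedelta(days=i), root) for i in range(days + 1)]


if __name__ == "__main__":
    # Each load writes a new file; earlier increments are never modified.
    print(increment_path("factpatientVisit", date(2017, 3, 5)))
    for path in increment_paths("dimclinic", date(2017, 2, 28), date(2017, 3, 1)):
        print(path)
```

Daily granularity is just the example's choice. If the daily volume is small, a coarser grain (for instance one file per month) would be more in line with the advice to keep increments substantial.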

