
Build a pipeline in Azure Data Factory to load Excel files, format content, convert to CSV and send to Azure SQL DB


I'm new to the Azure environment and have been watching tutorials and reading documentation, but I'm trying to figure out how to set up a flow for the process described below. The starting point is a set of reports in .xlsx format produced monthly by the Marketing department; the requirement is to bring them into Azure SQL DB so that the data can be stored and analysed. So far I have managed to put those files (previously converted manually to .csv) in Blob storage and build an ADF pipeline that copies each file into a table in the SQL DB. The problem is that, as far as I understand, ADF cannot handle .xlsx files directly, so I'm wondering how to set up an automated procedure that converts .xlsx to .csv and saves the result to Blob storage. I was thinking about adding a Python script or Databricks notebook to the pipeline to do the conversion, but I'm not sure that is the best solution. Any hint or reference to an existing tutorial or resource would be much appreciated.

I found a tutorial which uses Logic Apps to do the conversion.

Datanovice indirectly suggested using a Custom activity to run either a C# or Python application to do the conversion for you.

The least expensive solution would be to do the conversion before uploading to blob, like Datanovice said.
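As a rough illustration of the Python route (run locally before upload, or inside a Custom activity), here is a minimal sketch. It assumes the third-party openpyxl package is installed and that each report keeps its data on a single worksheet; the function names `rows_to_csv` and `xlsx_to_csv` are made up for this example and are not part of ADF or any Azure SDK.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialise an iterable of row tuples to CSV text (empty cells become empty fields)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


def xlsx_to_csv(xlsx_bytes, sheet_name=None):
    """Convert one worksheet of an .xlsx file (given as bytes) to CSV text.

    Assumption: openpyxl is available; it is imported here so that the
    rest of the module works without it.
    """
    from openpyxl import load_workbook

    # data_only=True returns cached formula results instead of formula text.
    workbook = load_workbook(io.BytesIO(xlsx_bytes), read_only=True, data_only=True)
    # Hypothetical default: use the first worksheet when no name is given.
    sheet = workbook[sheet_name] if sheet_name else workbook.worksheets[0]
    return rows_to_csv(sheet.iter_rows(values_only=True))
```

The CSV text returned by `xlsx_to_csv` could then be written to Blob storage, where the existing copy pipeline would pick it up as it does for the manually converted files.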


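Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please cite this site's URL or the original source. For any questions, contact: yoyou2525@163.com.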
