简体   繁体   中英

How to decide between Azure Data Lake vs Azure SQL vs Azure Data Lake Analytics vs Azure SQL VM?

I am new to Azure and hence trying to understand what services to use when and how.

At the moment, I have one excel file that has couple of tabs that require some transformation to create one excel file tab (inside the source file itself - say Tab "x"). The final tab "x" created is then being useful for creating one final excel file that is shared to various team.

At present, everything is done manually.

This needs to change and the excel file shared to team has to be automated. The source of the file is the excel file that has various tabs (excluding tab " x ") and the reporting tool will be SSRS with excel data being stored in cloud.

Keeping this scenario in mind, what is the best way to store excel data into cloud? The excel data will be stored in cloud on a monthly basis. I am confused as to whether to store data in Azure-SQL, Azure Data Lake Gen 2 or Azure Data Lake Analytics or Azure SQL VM?

Every month data can be fetched from Excel file and populate into Azure using azure data factory. But I am not sure what is the best way to store data in the cloud considering the fact that some ETL process is needed to generate data in format similar to tab " X ".

Your storage requirements are very minimal, so I would select Data Lake to store your documents. The alternative is Blob Storage, but I always prefer Data Lake because it works with Azure Active Directory.

In your scenario, drop it in the ADL, and use the ADL as the source in Azure Data Factory.

Edit:

Honestly, your original post is a little confusing. You have a RAW Excel document, you do some transformations on the RAW document, to generate an Excel Source document. This source document holds the final dataset that the dev team will use to build out SSRS reports. You need to make this dataset available to the teams so that they can connect to it to build the reports? My suggestion is to keep it simple and drop the final source dataset in Excel format, into blob or data lake storage and then ask the dev guys to pick it up from the location. If you are going the route of designing and maintaining a data pipeline (Blob > Data Factory > SQL, or CSV, TSV - then you are introducing unnecessary complications.

I think you can think about to using Azure SQL database.

Azure SQL database or SQL server support you import data from the excel( or csv) files. For more details and limits, please see: Import data from Excel to SQL Server or Azure SQL Database .

If your data have stored in Azure SQL database, you also can using EXCEL to get the data from Azure SQL database:

Connect Excel to a single database in Azure SQL Database and import data and create tables and charts based on values in the database. In this tutorial you will set up the connection between Excel and a database table, save the file that stores data and the connection information for Excel, and then create a pivot chart from the database values.

Reference: Import data from Excel to SQL Server or Azure SQL Database .

I think you don't need to store these excel files in Azure Data Lake .Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage. It's still a storage.

The more Azure resource you use, the more cost you need to pay .

If your excel file stored in you local computer, you can using Azure Data Factory to access these local files or with self host integration runtime .

Please reference: Copy data to or from a file system by using Azure Data Factory .

Hope this helps.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM