
How to use Azure Databricks to read and write Excel data with multiple sheets from ADLS Gen2

Original 2021-10-27 17:19:29 8 1 python/ pyspark/ databricks/ azure-databricks

I want to implement the logic below in Azure Databricks using PySpark. I have an Excel file that contains multiple sheets and is stored on ADLS Gen2. I want to read the data from all sheets, combine it into a separate file, and write that file to a location in ADLS Gen2 as well.

Note: All sheets have the same schema (Id, Name).

My final output file should contain the data from all the sheets. I also need to create an additional column that stores the sheet name.

1 answer

You can use the following logic:

1. Use pandas to read the multiple worksheets of the same workbook.
2. Concatenate the resulting pandas DataFrames into a single DataFrame.
3. Convert the pandas DataFrame into a PySpark DataFrame.
4. Apply whatever business logic you want to implement.
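
The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. It assumes the ADLS Gen2 container is already mounted in Databricks so that pandas can reach the workbook through a local-style `/dbfs/mnt/...` path, and that an Excel engine such as `openpyxl` is installed on the cluster. The helper names `combine_sheets` and `excel_to_spark`, the column name `SheetName`, and the paths in the usage comment are hypothetical.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a {sheet_name: DataFrame} mapping into one DataFrame.

    A SheetName column records which sheet each row came from.
    """
    frames = [df.assign(SheetName=name) for name, df in sheets.items()]
    return pd.concat(frames, ignore_index=True)


def excel_to_spark(spark, excel_path):
    """Read every sheet of a workbook and return a single Spark DataFrame."""
    # sheet_name=None makes pandas return a dict of {sheet name: DataFrame}
    sheets = pd.read_excel(excel_path, sheet_name=None)
    combined = combine_sheets(sheets)
    # Convert the pandas DataFrame into a PySpark DataFrame
    return spark.createDataFrame(combined)


# Hypothetical usage inside a Databricks notebook, where `spark` is predefined
# and <mount> stands for your own mount point (an assumption, not from the post):
#
# sdf = excel_to_spark(spark, "/dbfs/mnt/<mount>/input.xlsx")
# sdf.write.mode("overwrite").option("header", True).csv("/mnt/<mount>/output/")
```

Keeping the concatenation in its own function means the sheet-name logic can be checked with small in-memory DataFrames, without needing a cluster or a real workbook. Because all the data passes through pandas on the driver, this approach is likely only suitable for workbooks that fit comfortably in driver memory.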

Related questions

Read files from multiple folders from ADLS gen2 storage via databricks and create single target file

How to create directory in ADLS gen2 from pyspark databricks

Azure Databricks pyspark readstream reads non orc files from the mounted ADLS Gen2 input path

Read .nc files from Azure Datalake Gen2 in Azure Databricks

Read CSV from Azure Data Lake Storage Gen 2 to Pandas Dataframe | NO DATABRICKS

How can I register a specific version of a Delta Table in Azure Machine Learning Studio from Azure ADLS Gen 1?

Azure ADLS Gen2 File read using Python (without ADB)

Upload data to the Azure ADLS Gen2 from on-premise using Python or Java

How to retrieve .dcm image files from the ADLS gen2 using Azure Synapse and pySpark notebook?

How can I save files to ADLS from Azure Databricks python notebook?


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1 0 2021-10-27 18:20:41
