
Saving a DataFrame as a CSV file (processed in Databricks) and uploading it to Azure Data Lake blob storage

I have a CSV file stored in Azure Data Lake Storage, which I imported into Databricks by mounting the Data Lake account in my Databricks cluster. After preprocessing, I want to store the CSV back in the same Data Lake Gen2 (blob storage) account. Any leads and help on the issue are appreciated. Thanks.

Just write the file to the same mounted location. See the example notebook here: https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#example-notebook

df.write.json("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/iot_devices.json")
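The documentation example writes JSON; since the question asks for CSV, here is a minimal PySpark sketch of the same idea written to the mounted path instead. The mount point /mnt/datalake and the processed/ output folder are placeholders, not names from the docs:

# Write the processed DataFrame back to the mounted ADLS Gen2 location as CSV.
# "/mnt/datalake/processed/output_csv" is a placeholder path under your own mount point.
(df.write
    .format("csv")
    .option("header", "true")
    .mode("overwrite")
    .save("/mnt/datalake/processed/output_csv"))

Note that Spark writes a directory of part files rather than a single CSV file; call df.coalesce(1) before the write if a single output file is required.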

Just save it directly to Blob storage.

df.write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .save("myfile.csv")

There is no point in saving the file locally and then pushing it into the Blob.
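As a sketch of that approach, the same write can target the container path directly rather than a relative path. Note that com.databricks.spark.csv is the legacy spark-csv package; on Spark 2.x and later the built-in csv source does the same job. The container and storage-account names below are placeholders:

# Save straight to the ADLS Gen2 / Blob container; no local copy is needed.
# <container> and <storage-account-name> are placeholders for your own account.
(df.write
    .format("csv")
    .option("header", "true")
    .mode("overwrite")
    .save("abfss://<container>@<storage-account-name>.dfs.core.windows.net/processed/myfile_csv"))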
