
Saving a DataFrame as a CSV file (processed in Databricks) and uploading it to Azure Data Lake blob storage

I have a CSV file stored in Azure Data Lake Storage, which I imported into Databricks by mounting the Data Lake account in my Databricks cluster. After preprocessing, I want to store the CSV back in the same Data Lake Gen2 (blob storage) account. Any leads and help on the issue are appreciated. Thanks.

Just write the file to the same mounted location. See the example notebook here: https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#example-notebook

df.write.json("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/iot_devices.json")
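The documentation example writes JSON; since the question asks for CSV, here is a minimal PySpark sketch of the same idea written to the mounted path instead. The mount point /mnt/datalake and the processed/ output folder are placeholders, not names from the docs:

# Write the processed DataFrame back to the mounted ADLS Gen2 location as CSV.
# "/mnt/datalake/processed/output_csv" is a placeholder path under your own mount point.
(df.write
    .format("csv")
    .option("header", "true")
    .mode("overwrite")
    .save("/mnt/datalake/processed/output_csv"))

Note that Spark writes a directory of part files rather than a single CSV file; call df.coalesce(1) before the write if a single output file is required.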

Just save it directly to Blob storage.

df.write
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .save("myfile.csv")

There is no point in saving the file locally and then pushing it into the Blob.
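As a sketch of that approach, the same write can target the container path directly rather than a relative path. Note that com.databricks.spark.csv is the legacy spark-csv package; on Spark 2.x and later the built-in csv source does the same job. The container and storage-account names below are placeholders:

# Save straight to the ADLS Gen2 / Blob container; no local copy is needed.
# <container> and <storage-account-name> are placeholders for your own account.
(df.write
    .format("csv")
    .option("header", "true")
    .mode("overwrite")
    .save("abfss://<container>@<storage-account-name>.dfs.core.windows.net/processed/myfile_csv"))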
