Saving a dataframe as a CSV file (processed in Databricks) and uploading it to Azure Data Lake blob storage
I have a CSV file stored in Azure Data Lake Storage, which I imported into Databricks by mounting the Data Lake account on my Databricks cluster. After preprocessing, I want to store the CSV back in the same Data Lake Gen2 (blob storage) account. Any leads and help on the issue are appreciated. Thanks.
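For context, a minimal sketch of the setup described above (all names, paths, and secret scopes are placeholders, and the OAuth service-principal mount is only one of several ways to mount ADLS Gen2):

# Mount the ADLS Gen2 container via a service principal, then read the CSV.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs,
)

# Read the input CSV from the mounted location into a dataframe.
df = spark.read.csv("/mnt/datalake/input.csv", header=True, inferSchema=True)
# ... preprocessing on df ...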
Just write a file in the same mounted location. See the example here: https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#example-notebook
df.write.json("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/iot_devices.json")
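Since the question is about CSV rather than JSON, a comparable write for CSV output might look like the following sketch (same placeholder file system and storage account names as above, and the output folder name is hypothetical):

# Write the processed dataframe back to the same ADLS Gen2 container as CSV.
(df.write
   .option("header", "true")
   .mode("overwrite")
   .csv("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/processed_output"))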
Just save it directly to Blob storage.
(df.write
   .format("com.databricks.spark.csv")
   .option("header", "true")
   .save("myfile.csv"))
There is no point in saving the file locally and then pushing it into the Blob.
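If the Data Lake account is already mounted as described in the question, a sketch of writing directly through the mount point (the /mnt/datalake path and output folder are placeholders) could look like this. Note that Spark writes a directory of part files rather than a single file; coalesce(1) is one way to end up with a single CSV part.

# Write the processed dataframe through the mount point as a single CSV part file.
(df.coalesce(1)
   .write
   .format("csv")
   .option("header", "true")
   .mode("overwrite")
   .save("/mnt/datalake/processed_output"))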