Saving a dataframe as a CSV file (processed in Databricks) and uploading it to Azure Data Lake blob storage
I have a CSV file stored in Azure Data Lake Storage, which I imported into Databricks by mounting the Data Lake account on my Databricks cluster. After preprocessing, I want to store the CSV back in the same Data Lake Gen2 (blob storage) account. Any leads and help on the issue are appreciated. Thanks.
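For context, a minimal sketch of the setup described above (all names, paths, and secret scopes are placeholders, and the OAuth service-principal mount is only one of several ways to mount ADLS Gen2):

# Mount the ADLS Gen2 container via a service principal, then read the CSV.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs,
)

# Read the input CSV from the mounted location into a dataframe.
df = spark.read.csv("/mnt/datalake/input.csv", header=True, inferSchema=True)
# ... preprocessing on df ...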
Just write a file in the same mounted location. See the example here: https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#example-notebook
df.write.json("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/iot_devices.json")
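Since the question is about CSV rather than JSON, a comparable write for CSV output might look like the following sketch (same placeholder file system and storage account names as above, and the output folder name is hypothetical):

# Write the processed dataframe back to the same ADLS Gen2 container as CSV.
(df.write
   .option("header", "true")
   .mode("overwrite")
   .csv("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/processed_output"))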
Just save it directly to Blob storage.
(df.write
   .format("com.databricks.spark.csv")
   .option("header", "true")
   .save("myfile.csv"))
There is no point in saving the file locally and then pushing it into the Blob.
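If the Data Lake account is already mounted as described in the question, a sketch of writing directly through the mount point (the /mnt/datalake path and output folder are placeholders) could look like this. Note that Spark writes a directory of part files rather than a single file; coalesce(1) is one way to end up with a single CSV part.

# Write the processed dataframe through the mount point as a single CSV part file.
(df.coalesce(1)
   .write
   .format("csv")
   .option("header", "true")
   .mode("overwrite")
   .save("/mnt/datalake/processed_output"))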