
Saving a dataframe as a csv file (processed in Databricks) and uploading it to Azure Data Lake blob storage

I had a CSV file stored in Azure Data Lake Storage, which I imported into Databricks by mounting the Data Lake account in my Databricks cluster. After doing preprocessing, I want to store the CSV back in the same Data Lake Gen2 (blob storage) account. Any leads and help on the issue are appreciated. Thanks.
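For context, this is roughly how the mount mentioned in the question is usually created. A minimal PySpark sketch, assuming OAuth with a service principal; the application ID, secret scope, tenant, container, and mount point below are all placeholders, not values from the question:

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount the ADLS Gen2 container so it is visible under /mnt/datalake
dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs,
)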

Just write a file to the same mounted location. See the example notebook here: https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#example-notebook

df.write.json("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/iot_devices.json")
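Since the question is about CSV rather than JSON, the equivalent CSV write looks like this. A sketch assuming the same placeholder account and container as above, plus an illustrative output path:

# Write the dataframe as CSV, either directly via the abfss:// URI...
df.write.option("header", "true").csv("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/processed/output_csv")

# ...or through the mount point, if one was created as sketched above
df.write.option("header", "true").csv("/mnt/datalake/processed/output_csv")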

Just save it directly to Blob storage.

df.write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .save("myfile.csv")

There is no point in saving the file locally and then pushing it into the Blob.
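One caveat: Spark writes save() output as a directory of part files, not a single myfile.csv. A sketch, with placeholder paths, of using coalesce(1) to get one CSV file in the blob container (on Spark 2+ the built-in csv format can be used instead of the legacy com.databricks.spark.csv package):

# coalesce(1) funnels all partitions into a single output part file.
# The abfss:// path below is a placeholder for your own container/account.
df.coalesce(1) \
  .write \
  .option("header", "true") \
  .mode("overwrite") \
  .csv("abfss://<file_system>@<storage-account-name>.dfs.core.windows.net/processed/myfile_csv")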
