
to_csv "No Such File or Directory" But the directory does exist - Databricks on ADLS

I've seen many iterations of this question but cannot seem to understand/fix this behavior.

I am on Azure Databricks working on DBR 10.4 LTS Spark 3.2.1 Scala 2.12 trying to write a single csv file to blob storage so that it can be dropped to an SFTP server. Could not use spark-sftp because I am on Scala 2.12 unfortunately and could not get the library to work.

Given this is a small dataframe, I am converting it to pandas and then attempting to_csv.

to_export = df.toPandas()

to_export.to_csv(pathToFile, index = False)

I get the error: [Errno 2] No such file or directory: '/dbfs/mnt/adls/Sandbox/user/project_name/testfile.csv'

Based on the information in other threads, I create the directory with dbutils.fs.mkdirs("/dbfs/mnt/adls/Sandbox/user/project_name/"), which returns:

Out[40]: True

The response is true and the directory exists, yet I still get the same error. I'm convinced it is something obvious and I've been staring at it for too long to notice. Does anyone see what my error may be?

  • pandas writes through the local file system, so it only recognizes a mounted path when it is in File API format (/dbfs/mnt/...). dbutils.fs.mkdirs expects Spark API format (dbfs:/mnt/... or /mnt/...), which is a different path format.

  • Because you call dbutils.fs.mkdirs with the path /dbfs/mnt/adls/Sandbox/user/project_name/, the path is actually interpreted as dbfs:/dbfs/mnt/adls/Sandbox/user/project_name/. Hence, the directory is created inside DBFS under a top-level dbfs folder, not under the mount that pandas tries to write to.

dbutils.fs.mkdirs('/dbfs/mnt/repro/Sandbox/user/project_name/')


  • So, you have to create the directory with the Spark API path instead, by changing the code to the following:
dbutils.fs.mkdirs('/mnt/repro/Sandbox/user/project_name/')
#OR
#dbutils.fs.mkdirs('dbfs:/mnt/repro/Sandbox/user/project_name/')
  • Writing to the folder would now work without any issue.
pdf.to_csv('/dbfs/mnt/repro/Sandbox/user/project_name/testfile.csv', index=False)
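If it helps to keep the two formats straight, here is a minimal sketch of helpers that convert between them. The names to_file_api_path, to_spark_api_path and ensure_parent_dir are hypothetical (they are not part of dbutils or pandas), and the sketch assumes DBFS is exposed on the driver at /dbfs, as it is in the paths above. It also assumes that a bare path already starting with /dbfs/ is meant as a File API path.

```python
import os

FUSE_ROOT = "/dbfs"  # assumption: local mount point of DBFS on the driver


def to_file_api_path(path: str) -> str:
    """Convert 'dbfs:/mnt/x' or '/mnt/x' (Spark API format) to '/dbfs/mnt/x'."""
    if path.startswith("dbfs:/"):
        return FUSE_ROOT + "/" + path[len("dbfs:/"):].lstrip("/")
    if path == FUSE_ROOT or path.startswith(FUSE_ROOT + "/"):
        # Assumption: a bare path under /dbfs is already in File API format.
        return path
    return FUSE_ROOT + "/" + path.lstrip("/")


def to_spark_api_path(path: str) -> str:
    """Convert '/dbfs/mnt/x' (File API format) to 'dbfs:/mnt/x'."""
    if path.startswith("dbfs:/"):
        return path  # already in Spark API format
    if path == FUSE_ROOT or path.startswith(FUSE_ROOT + "/"):
        path = path[len(FUSE_ROOT):]  # drop the local /dbfs prefix
    return "dbfs:/" + path.lstrip("/")


def ensure_parent_dir(file_api_path: str) -> str:
    """Create the parent directory of a local (File API) path if it is missing."""
    parent = os.path.dirname(file_api_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return file_api_path


# On Databricks (dbutils and pdf exist only in the notebook, so this is left commented):
#   target = "/dbfs/mnt/repro/Sandbox/user/project_name/testfile.csv"
#   dbutils.fs.mkdirs(to_spark_api_path(os.path.dirname(target)))
#   pdf.to_csv(target, index=False)
```

Alternatively, since pandas and os both use the File API path, calling ensure_parent_dir(target) before to_csv keeps the mkdir and the write in the same path format, assuming your cluster allows local file operations on the mount.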

