Not able to copy file from DBFS to local desktop in Databricks

I want to save or copy a file from DBFS to my local desktop. I used this command but got an error:

dbutils.fs.cp('/dbfs/username/test.txt', 'C:\Users\username\Desktop') 
Error: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

When I looked up dbutils.fs.help() for my case, I followed the instructions:

dbutils.fs provides utilities for working with FileSystems. Most methods in this package can take either a DBFS path (e.g., "/foo" or "dbfs:/foo"), or another FileSystem URI. For more info about a method, use dbutils.fs.help("methodName"). In notebooks, you can also use the %fs shorthand to access DBFS. The %fs shorthand maps straightforwardly onto dbutils calls. For example, "%fs head --maxBytes=10000 /file/path" translates into "dbutils.fs.head("/file/path", maxBytes = 10000)".

fsutils
cp(from: String, to: String, recurse: boolean = false): boolean -> Copies a file or directory, possibly across FileSystems
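The SyntaxError itself is plain Python string handling: in 'C:\Users\username\Desktop' the sequence \U starts a Unicode escape, so the literal is rejected before dbutils ever runs; a raw string (r'C:\Users\...') avoids that particular error. But even with a valid string, dbutils.fs.cp executes on the cluster and can only reach filesystems the cluster can see, so it cannot write to your Windows desktop. A minimal sketch of what it can do instead (assuming the file lives at dbfs:/username/test.txt; dbutils treats plain paths as DBFS paths, so the /dbfs FUSE prefix isn't needed here):

# A raw string avoids the '\U' escape error, but C:\ is not visible to the cluster:
# dbutils.fs.cp(r'dbfs:/username/test.txt', r'C:\Users\username\Desktop\test.txt')

# dbutils.fs.cp can copy between locations the cluster can reach,
# e.g. from DBFS to the driver node's local disk:
dbutils.fs.cp('dbfs:/username/test.txt', 'file:/tmp/test.txt')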

To download files from DBFS to your local machine, you can follow the steps below.

Steps for installing and configuring the Azure Databricks CLI using cmd:

Step 1: Install Python. You'll need Python 2.7.9 or above if you're using Python 2, or Python 3.6 or above if you're using Python 3.

Step 2: Run pip install databricks-cli, using the appropriate version of pip for your Python installation. If you are using Python 3, run pip3 install databricks-cli.
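You can confirm the CLI is on your path by asking it for its version; this should print the installed databricks-cli version:

databricks --version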

Step 3: Set up authentication. To authenticate and access the Databricks REST APIs, you use personal access tokens. Tokens are similar to passwords; treat them with care. Tokens expire and can be revoked.

  • Click the user profile icon in the upper-right corner of your Azure Databricks workspace.

  • Click User Settings.

  • Go to the Access Tokens tab.

  • Click the Generate New Token button.
  • Optionally enter a description (comment) and expiration period.

  • Click the Generate button.
  • Make sure to copy the generated token and store it in a secure location.

Step 4: Copy the Databricks host URL, e.g. "https://centralus.azuredatabricks.net/", and the token you created in the earlier step.

Step 5: In cmd, run "dbfs configure --token" as shown below:

dbfs configure --token
Databricks Host (should begin with https://): https://centralus.azuredatabricks.net
Token: dapi72026dsfsdfsh987hjfiu431
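
The CLI stores these values in a .databrickscfg file in your home directory, which should look roughly like the snippet below (the token is the same placeholder as above). Alternatively, the CLI can pick up the same values from the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables.

[DEFAULT]
host = https://centralus.azuredatabricks.net
token = dapi72026dsfsdfsh987hjfiu431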

Step 6: The Databricks CLI is now successfully configured from cmd.

To verify, try running "databricks fs ls" and check whether you can see the contents of DBFS.

Reference: Databricks CLI

You can use the Databricks CLI to download files from the Databricks file system to your local machine as follows:

dbfs cp dbfs:/myfolder/BRK4024.pptx A:DataSet\

Example: since I have a sample BRK4024.pptx file in myfolder on DBFS, I'm using the Databricks CLI command to copy it to a folder on the local machine (A:DataSet).
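
If you need a whole folder rather than a single file, dbfs cp also accepts a recursive flag, and --overwrite replaces files that already exist at the destination (the paths below just mirror the example above):

dbfs cp -r dbfs:/myfolder A:DataSet\myfolder
dbfs cp --overwrite dbfs:/myfolder/BRK4024.pptx A:DataSet\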

Hope this helps.


If this answers your query, do click "Mark as Answer" and "Up-Vote". And if you have any further query, do let us know.

You need to use the Databricks CLI for this task.

Install the CLI on your local machine and run databricks configure to authenticate. Use an access token generated under user settings as the password.

Once you have the CLI installed and configured to your workspace, you can copy files to and from DBFS like this:

databricks fs cp dbfs:/path_to_file/my_file /path_to_local_file/my_file

You can also use the shorthand:

dbfs cp dbfs:/path_to_file /path_to_local_file
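
The same command works in the other direction as well, from your local machine up to DBFS, and --recursive copies whole directories (paths here are placeholders):

databricks fs cp /path_to_local_file/my_file dbfs:/path_to_file/my_file
databricks fs cp --recursive dbfs:/path_to_dir /path_to_local_dir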
