
Mount S3 to Databricks

I'm trying to understand how mounting works. I have an S3 bucket named myB, with a folder in it called test. I mounted it using:

var AwsBucketName = "myB"
val MountName = "myB"

My question is: does this create a link between the S3 bucket myB and Databricks, so that Databricks can access all the files, including those under the test folder? (Or, if I mount using var AwsBucketName = "myB/test", does it link Databricks only to the test folder and not to any files outside it?)

If so, how do I list the files in the test folder, read a file, or count() the rows of a CSV file in Scala? I ran display(dbutils.fs.ls("/mnt/myB")) and it shows only the test folder, not the files in it. I'm quite new to this. Many thanks for your help!

From the Databricks documentation:

// Replace with your values
val AccessKey = "YOUR_ACCESS_KEY"
// Encode the Secret Key as that can contain "/"
val SecretKey = "YOUR_SECRET_KEY".replace("/", "%2F")
val AwsBucketName = "MY_BUCKET"
val MountName = "MOUNT_NAME"

dbutils.fs.mount(s"s3a://$AccessKey:$SecretKey@$AwsBucketName", s"/mnt/$MountName")
display(dbutils.fs.ls(s"/mnt/$MountName"))

If you cannot see files in your mounted directory, you may have created a directory under /mnt that is not a link to the S3 bucket. If so, try deleting the directory (dbutils.fs.rm) and remounting using the code sample above. Note that you will need your AWS credentials (AccessKey and SecretKey above). If you don't know them, ask your AWS account admin.
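One way to tell a real mount from a plain directory is to look it up in the list of active mounts. The sketch below assumes it runs in a Databricks Scala notebook (where dbutils is predefined) and that the mount point is /mnt/myB as in the question; adjust the path to your own setup.

```scala
// Assumption: executed in a Databricks Scala notebook, where `dbutils` is available.
val mountPoint = "/mnt/myB" // assumed mount point, taken from the question

// dbutils.fs.mounts() returns every active mount with its mountPoint and source.
val existing = dbutils.fs.mounts().filter(_.mountPoint == mountPoint)

if (existing.isEmpty) {
  println(s"$mountPoint is not an active mount; it may be a plain DBFS directory")
  // Uncomment to delete the plain directory (recursively) before remounting:
  // dbutils.fs.rm(mountPoint, true)
} else {
  // Show which S3 location the mount point actually links to.
  existing.foreach(m => println(s"${m.mountPoint} -> ${m.source}"))
}
```

If the path does turn out to be a stale mount rather than a plain directory, dbutils.fs.unmount(mountPoint) removes it so that it can be mounted again.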

dbutils.fs.ls lists only the folders and files directly under the bucket; it does not descend into subfolders.

In S3

<bucket-name>/<Files & Folders>

In Databricks

/mnt/<MOUNT-NAME>/<Bucket-Data-List>

For example, the output of dbutils.fs.ls(s"/mnt/$MountName") looks like this:

dbfs:/mnt/<MOUNT-NAME>/Folder/  
dbfs:/mnt/<MOUNT-NAME>/file1.csv
dbfs:/mnt/<MOUNT-NAME>/file2.csv
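To see what is inside a subfolder such as test, list that path explicitly, then read a file through the same mount path. This is a minimal sketch, assuming a Databricks Scala notebook (dbutils, display and spark are predefined there), the mount point /mnt/myB from the question, and a hypothetical file name example.csv with a header row.

```scala
// Assumption: executed in a Databricks Scala notebook.
val mountPoint = "/mnt/myB" // assumed mount point, taken from the question

// ls is not recursive, so point it at the subfolder to see its contents.
display(dbutils.fs.ls(s"$mountPoint/test"))

// Read one CSV file through the mount. "example.csv" is a hypothetical name;
// drop the header option if the file has no header row.
val df = spark.read
  .option("header", "true")
  .csv(s"$mountPoint/test/example.csv")

// Number of data rows in the file.
println(df.count())
```

Passing the folder path (s"$mountPoint/test") to csv instead of a single file reads every CSV file in that folder into one DataFrame.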


 