
Can't access mounted volume with Python on Databricks

I am trying to give a team access to an Azure Storage Account Gen2 container in their Databricks workspace by mounting it to the DBFS, using credential passthrough. I want to manage access with Active Directory, since eventually some containers will need to be mounted read-only.

I based my code on this tutorial: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/adls-passthrough#adls-aad-credentials

An extract from my cluster conf:

"spark_conf": {
        "spark.databricks.cluster.profile": "serverless",
        "spark.databricks.passthrough.enabled": "true",
        "spark.databricks.delta.preview.enabled": "true",
        "spark.databricks.pyspark.enableProcessIsolation": "true",
        "spark.databricks.repl.allowedLanguages": "python,sql"
    }

I then run the following code:

dbutils.fs.mount(
  source = "wasbs://data@storage_account_name.blob.core.windows.net",
  mount_point = "/mnt/data/",
  extra_configs = {
    # Delegate authentication to the caller's AAD token (credential passthrough)
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class": spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
  }
)

The mount succeeds, as I can access the volume with dbutils:

>> dbutils.fs.ls('dbfs:/mnt/storage_account_name/data')
[FileInfo(path='dbfs:/mnt/storage_account_name/data/folder/', name='folder/', size=0)]

My issue is that when I run %sh ls /dbfs/mnt/storage_account_name/data, or try to access the mount with Python, it fails:

>> import os 
>> os.listdir('/dbfs/')
Out[1]: []

>> os.listdir('/dbfs/mnt/')
FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/'

I can't figure out what I am missing. Is there something to configure to make it accessible to Python? Thanks.

There are certain limitations when you use the credential passthrough option, which is why it did not work; there is no syntax issue. See the official doc (linked below) to understand.

The answer is plain and simple:

Local file API Limitations

The following list enumerates the limitations in local file API usage that apply to each Databricks Runtime version.

All - Does not support credential passthrough.

Source: https://docs.microsoft.com/en-us/azure/databricks/data/databricks-file-system#local-file-apis
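In practice, the workaround on a passthrough cluster is to go through dbutils.fs or the Spark APIs instead of the local file API (os, %sh, /dbfs FUSE paths), which do work with the mount as your dbutils.fs.ls output already shows. A minimal sketch, reusing the mount point from the question (example.csv is a hypothetical file name for illustration):

# The FUSE mount at /dbfs is not populated under credential passthrough,
# so list the mount with dbutils.fs instead of os.listdir or %sh ls.
for f in dbutils.fs.ls("dbfs:/mnt/storage_account_name/data"):
    print(f.path, f.size)

# Read data through Spark rather than the local file API.
df = spark.read.csv(
    "dbfs:/mnt/storage_account_name/data/folder/example.csv",  # hypothetical path
    header=True,
)
df.show()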
