
How to access files or folders from an Azure ML service dataset?

Currently I am working with the Azure ML service, where I have a dataset named 'voice_recognition_expreimnt'. I access this dataset with this code:

file_dataset = Dataset.get_by_name(workspace=ws, name='voice_recognition_expreimnt')

Now I want to access all the files and folders in the dataset. How can I traverse every path in my dataset? I have searched a lot but can't find any solution, so please help me.

The answer depends on whether you plan to work directly inside a compute instance notebook or to submit runs via a ScriptRunConfig or an Estimator.

Direct access

You can use .download() to put the files on the machine you're currently working on.

file_dataset.download()
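
If you pass a target_path, .download() copies the dataset files to that folder, which you can then walk like any other directory. Below is a minimal sketch, assuming the dataset fits on local disk (the './voice_data' folder name is just an example):

import os
from azureml.core import Dataset, Workspace

ws = Workspace.from_config()  # assumes a config.json for your workspace
file_dataset = Dataset.get_by_name(workspace=ws, name='voice_recognition_expreimnt')

# download all files in the dataset to a local folder
file_dataset.download(target_path='./voice_data', overwrite=True)

# traverse every file and folder that was downloaded
for root, dirs, files in os.walk('./voice_data'):
    for f in files:
        print(os.path.join(root, f))

If you only need the relative paths inside the dataset without downloading anything, FileDataset.to_path() returns them as a list.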

Consumption via training runs

Below is a common pattern in the Azure ML SDK for making datasets available to Runs, Estimators, PythonScriptSteps, and the like. All of these classes make it especially easy to run your code against your dataset on many compute targets.

from azureml.core import Experiment, ScriptRunConfig

src = ScriptRunConfig(
    source_directory=source_directory,
    script='dummy_train.py',
    arguments=[file_dataset.as_named_input('input').as_mount(),
               output]  # `output` is assumed to be a previously defined output location
)

exp = Experiment(ws, 'ScriptRun_sample')
run = exp.submit(config=src)
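
On the training side, the mounted dataset shows up as an ordinary folder path in the script's arguments. Here is a minimal sketch of what dummy_train.py could look like (the script name comes from the config above; the rest is illustrative):

# dummy_train.py
import os
import sys

# the first argument is the mount point of the 'input' dataset
mounted_path = sys.argv[1]

# traverse every file made available through the mount
for root, dirs, files in os.walk(mounted_path):
    for f in files:
        print(os.path.join(root, f))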

Here are a few tutorials that go into more detail.

  1. Creating and using a FileDataset within an Estimator
  2. How to use ScriptRun with data input and output notebook (the entire "datasets tutorial" folder is a great example).
