
How to train your own model in AWS Sagemaker?

I just started with AWS and I want to train my own model with my own dataset. I have a Keras model with a TensorFlow backend, written in Python. The documentation I have read says I need a Docker image to load my model, so how do I convert a Keras model into a Docker image? I searched the internet but found nothing that explains the process clearly. How do I make a Docker image of a Keras model, and how do I load it into SageMaker? Also, how do I get my data from an h5 file into an S3 bucket for training? Can anyone give me a clear explanation?

Although you can load a Docker container into SageMaker for production, it sounds like you would be better served by completing the entire SageMaker pipeline: starting with your data in S3 and training via a Jupyter notebook, which supports Keras and TensorFlow.
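
For example, from a SageMaker notebook you can push your dataset to S3 with the SageMaker Python SDK instead of the CLI. A minimal sketch, assuming the file name and key prefix are placeholders you replace with your own:

import sagemaker

session = sagemaker.Session()

# Uploads the local HDF5 file to the session's default bucket under the given prefix
# and returns the resulting S3 URI, e.g. "s3://<default-bucket>/training/file.h5".
train_s3_uri = session.upload_data(path="file.h5", key_prefix="training")
print(train_s3_uri)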

Once you have a model trained, the documentation walks through how to store and persist the model for production. For Docker, you would build your Docker container, push it to AWS ECR, and import it from there -- note that the awslabs examples follow a very specific Docker directory structure you need to match (example:

https://github.com/awslabs/amazon-sagemaker-examples/blob/caa8ce243b51f6bdb15f2afc638d9c4e2ad436b9/hyperparameter_tuning/keras_bring_your_own/trainer/environment.py ).

You can convert your Keras model to a tf.estimator and train using the TensorFlow framework estimators in Sagemaker.

That conversion is fairly basic, though. I instead reimplemented my models in TensorFlow using the tf.keras API, which keeps the model nearly identical, and trained them with the SageMaker TF estimator in script mode.
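
A rough sketch of that with the SageMaker Python SDK follows; the entry-point script name, IAM role, instance type, and framework/Python versions are placeholders, and parameter names differ slightly between SDK v1 (script_mode=True, train_instance_type, ...) and v2:

from sagemaker.tensorflow import TensorFlow

# "train.py" is a hypothetical tf.keras training script you write yourself.
estimator = TensorFlow(
    entry_point="train.py",
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    framework_version="2.4",
    py_version="py37",
)

# The dict key ("train") becomes the channel name, mounted at /opt/ml/input/data/train.
estimator.fit({"train": "s3://my-sagemaker-bucket/folder"})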

My initial approach using pure Keras models was based on bring-your-own-algo containers similar to the answer by Matthew Arthur.

A good starting point for writing your own custom algorithm container is the scikit-learn Building Your Own Algorithm Container tutorial. It gives you an overview of Docker and walks through packaging your script into a container, uploading it, and running a training job.
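
The key convention from that tutorial is that SageMaker runs an executable named train inside your container and mounts everything under /opt/ml. A minimal sketch of such an entry point, assuming the hyperparameter names, the HDF5 file name, and the model-building code are yours to fill in:

#!/usr/bin/env python
# Minimal bring-your-own-container training entry point (sketch only).
import json
import os
import sys
import traceback

TRAIN_DIR = "/opt/ml/input/data/train"              # data for the channel named "train"
PARAM_PATH = "/opt/ml/input/config/hyperparameters.json"
MODEL_DIR = "/opt/ml/model"                         # artifacts saved here are uploaded to S3
FAILURE_PATH = "/opt/ml/output/failure"             # error text shown in the SageMaker console

def train():
    # Hyperparameters arrive as a JSON dict of strings.
    with open(PARAM_PATH) as f:
        hyperparams = json.load(f)
    epochs = int(hyperparams.get("epochs", 10))

    data_path = os.path.join(TRAIN_DIR, "file.h5")
    print(f"training for {epochs} epochs on {data_path}")

    # ... build and fit your Keras model here, then save it into MODEL_DIR ...
    # model.save(os.path.join(MODEL_DIR, "model.h5"))

if __name__ == "__main__":
    try:
        train()
        sys.exit(0)
    except Exception:
        # Writing to the failure file surfaces the error message in the job status.
        with open(FAILURE_PATH, "w") as f:
            f.write(traceback.format_exc())
        sys.exit(1)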

If you already have an HDF5 file, you can just use the AWS CLI to upload it to a bucket you own:

$ aws s3 cp ./path/to/file.h5 s3://my-sagemaker-bucket/folder/file.h5

Then, when creating your training job, you can specify an input channel: http://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateTrainingJob.html#SageMaker-CreateTrainingJob-request-InputDataConfig

[{ 
  "ChannelName": "train",
  "DataSource": {
    "S3DataSource": {
      "S3Uri": "s3://my-sagemaker-bucket/folder",
      "S3DataType": "S3Prefix",
      "S3DataDistributionType": "FullyReplicated"
    }
  }
}]

When the training job begins, your containerized script will find the file on its local filesystem at /opt/ml/input/data/train/file.h5 and can read it like a normal file. Note that "train" in this path corresponds to the channel name you specified in the input data config.
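
For example, reading it with h5py inside the container; the dataset keys "x" and "y" are assumptions about how your particular file is laid out:

import h5py

with h5py.File("/opt/ml/input/data/train/file.h5", "r") as f:
    x_train = f["x"][:]  # assumes features are stored under "x"
    y_train = f["y"][:]  # assumes labels are stored under "y"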

You can read more at https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html

Keras is now natively supported in SageMaker, with both the TensorFlow and MXNet built-in frameworks. You can train and deploy with SageMaker, or you can import existing Keras models in TensorFlow Serving format and deploy them.

Here's a detailed example: https://aws.amazon.com/blogs/machine-learning/train-and-deploy-keras-models-with-tensorflow-and-apache-mxnet-on-amazon-sagemaker/
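
For the import-an-existing-model path, here is a rough sketch with the SageMaker Python SDK (v2; in v1 the class lives at sagemaker.tensorflow.serving.Model). The model archive location, IAM role, and framework version are placeholders, and the tarball must contain your model exported in TensorFlow SavedModel format:

from sagemaker.tensorflow import TensorFlowModel

model = TensorFlowModel(
    model_data="s3://my-sagemaker-bucket/models/model.tar.gz",  # placeholder archive
    role="arn:aws:iam::123456789012:role/SageMakerRole",        # placeholder IAM role
    framework_version="2.4",
)

# Creates a real-time endpoint backed by TensorFlow Serving.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.large")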
