I'm taking my first steps in Amazon SageMaker. I'm using script mode to train a classification algorithm. Training works fine, but I can't get incremental training to work: I want to train the same model again with new data. Here is what I did. This is my script:
import sagemaker
from sagemaker.tensorflow import TensorFlow
from sagemaker import get_execution_role

bucket = 'sagemaker-blablabla'
train_data = 's3://{}/{}'.format(bucket, 'train')
validation_data = 's3://{}/{}'.format(bucket, 'test')
s3_output_location = 's3://{}'.format(bucket)

tf_estimator = TensorFlow(entry_point='main.py',
                          role=get_execution_role(),
                          train_instance_count=1,
                          train_instance_type='ml.p2.xlarge',
                          framework_version='1.12',
                          py_version='py3',
                          output_path=s3_output_location)

inputs = {'train': train_data, 'test': validation_data}
tf_estimator.fit(inputs)
The entry point is my custom Keras code, which I adapted to receive arguments from the script. Training completes successfully and the model.tar.gz is in my S3 bucket. I want to train again, but it's not clear to me how to do it. I tried this:
trained_model = 's3://sagemaker-blablabla/sagemaker-tensorflow-scriptmode-2019-11-27-12-01-42-300/output/model.tar.gz'

tf_estimator = sagemaker.estimator.Estimator(image_name='blablabla-west-1.amazonaws.com/sagemaker-tensorflow-scriptmode:1.12-gpu-py3',
                                             role=get_execution_role(),
                                             train_instance_count=1,
                                             train_instance_type='ml.p2.xlarge',
                                             output_path=s3_output_location,
                                             model_uri=trained_model)

inputs = {'train': train_data, 'test': validation_data}
tf_estimator.fit(inputs)
This doesn't work. First, I don't know how to retrieve the training image name (I looked it up in the AWS console, but I guess there should be a smarter solution). Second, this code throws an exception about the entry point, but it is my understanding that I shouldn't need one when I do incremental learning with a ready image. I'm surely missing something important. Any help? Thank you!
Incremental training is a native feature for the built-in Image Classifier and Object Detector. For custom code, it is the developer's responsibility to write the incremental training logic and to verify its validity. Here is a possible path:
1. In the first training job, save a model state (for example a checkpoint) so that it is kept after training.
2. In the second training job, pass that model state as an input of the fit call.
3. In your training code, check for that input to load a model state (the artifact to fine-tune).
Some frameworks provide better support for incremental learning than others. For example, some sklearn models provide a partial_fit method. For DL frameworks it is technically very easy to continue training from a checkpoint, but if the new data is very different from previously seen data, this may lead your model to forget what it learned before.
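On the training-script side, the path above could look roughly like the sketch below. It rests on assumptions that are not confirmed in the question: that the second job is launched so the previous artifact arrives as an input channel named model (for example by keeping the TensorFlow estimator with its entry_point and adding model_uri), that SageMaker exposes this channel's directory through the SM_CHANNEL_MODEL environment variable, and that main.py saved the Keras model as an .h5 file. The helper name find_previous_model is made up for illustration.

```python
import os
import tarfile
import tempfile


def find_previous_model(channel_dir, suffixes=(".h5",)):
    """Return the path of a saved model found under channel_dir, or None.

    Hypothetical helper: if the directory holds a model.tar.gz (the artifact
    written by an earlier training job), it is extracted to a temporary
    directory before searching for a file with one of the given suffixes.
    """
    if not channel_dir or not os.path.isdir(channel_dir):
        return None  # no model channel available: train from scratch

    search_dir = channel_dir
    archive = os.path.join(channel_dir, "model.tar.gz")
    if os.path.isfile(archive):
        # Assumption: the archive is delivered as-is and must be unpacked.
        search_dir = tempfile.mkdtemp(prefix="previous_model_")
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(search_dir)

    for root, _dirs, files in os.walk(search_dir):
        for name in sorted(files):
            if name.endswith(tuple(suffixes)):
                return os.path.join(root, name)
    return None


if __name__ == "__main__":
    # Assumption: SM_CHANNEL_MODEL is set only when a "model" channel exists.
    previous = find_previous_model(os.environ.get("SM_CHANNEL_MODEL"))
    if previous is None:
        print("No previous model found, training from scratch")
    else:
        print("Resuming training from", previous)
        # Here main.py would load the weights, for example with
        # tensorflow.keras.models.load_model(previous), then call fit() again.
```

With this in place the same main.py serves both runs: the first job finds no model channel and builds a fresh model, while the second job loads the earlier weights before training on the new data. Whether the result is still valid on the old data is something to check separately, as noted above.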