简体   繁体   中英

training google cloud ml engine actually in the cloud- clarification on the approach

I am trying to implement cloud based predictions for an sklearn model using google cloud ML engine. I am able to do this however it seems that even when using the REST API, it always references a trainer module that is actually trained offline /or on a standard python3 runtime that has sklearn installed, rather than any google service:

training_inputs = {'scaleTier': 'BASIC',
#'masterType': 'standard',
#'parameterServerType': 'large_model',
#'workerCount': 9,
#'parameterServerCount': 3,
'packageUris': ['gs://pathto/trainer/package/packages/trainer-0.0.0.tar.gz'],
'pythonModule': 'trainer.task',
'region': 'europe-west1',
'jobDir': ,
'runtimeVersion': '1.12',
'pythonVersion': '3.5'}

So, the way I see it, whether using gcloud (command line submission ) or the REST API via:

request = ml.projects().jobs().create(body=job_spec, parent=project_id)

The actual training is done by my python code running sklearn- ie the google cloud ML engine all it does is receive model specs from a sklearn model.bst file and then run the actual predictions. Is my understanding correct ? thanks for your help,

To answer your question, here is some background about ML Engine: the module referred in the command is the main module which starts whole training process. This process will include the training file and evaluation file in the code as in this example , and ML Engine will be in charge to create the model based on these files. Therefore, when submitting a training job to ML Engine, the train process will use ML Engine resources for each training step to create the model, which can be deployed into ML Engine for prediction.

For your question, ML Engine does not interfere the training datasets and model coding. That why it needs trainer module with the model specification and code. It provides resources for the model training and prediction, and manage the different version of the model. The diagram in this document should be a good reference for what ML Engine does.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM