Tensorflow Model Transformer gives error on model_data from previous training step property in AWS Sagemaker Workflow Pipeline
I'm attempting to set up an AWS SageMaker Pipeline that trains a TensorFlow model and then, once appropriate acceptance criteria have been passed, runs a batch transform step. This is all based on a tutorial presented by the Data Science On AWS group, though I have modified the code heavily (this step isn't in their original code).
Here is a portion of the relevant code:
...
training_step = TrainingStep(
    name="Train",
    estimator=estimator,
    inputs={
        "train": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "validation": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["validation"].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "test": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
            content_type="text/csv",
        ),
    },
    cache_config=cache_config,
)
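(The estimator definition is elided above. Purely as a hedged sketch for orientation, a TensorFlow estimator compatible with the inference image retrieved below might look like this; the entry point, instance settings, and versions here are assumptions, not the original values.)

from sagemaker.tensorflow import TensorFlow

# Hypothetical estimator sketch; entry_point, instance type/count, and
# versions are assumed here, not taken from the original pipeline.
estimator = TensorFlow(
    entry_point="train.py",
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.3.1",
    py_version="py37",
)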
...
inference_image_uri = sagemaker.image_uris.retrieve(
    framework="tensorflow",  # todo: edit
    region=region,
    version="2.3.1",
    py_version="py37",
    instance_type=deploy_instance_type,
    image_scope="inference",
)
print('Inference image uri: ', inference_image_uri)
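(For context: deploy_instance_type and deploy_instance_count are dereferenced with .default_value further down, which suggests they are pipeline parameters. A minimal sketch of how they might be declared, with assumed names and defaults:)

from sagemaker.workflow.parameters import ParameterInteger, ParameterString

# Assumed declarations inferred from how these names are used below;
# the actual parameter names and defaults in the original pipeline may differ.
deploy_instance_type = ParameterString(
    name="DeployInstanceType", default_value="ml.m5.large"
)
deploy_instance_count = ParameterInteger(
    name="DeployInstanceCount", default_value=1
)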
...
# Create Model for Deployment Step
model = Model(
    name=model_name,
    image_uri=inference_image_uri,
    model_data=training_step.properties.ModelArtifacts.S3ModelArtifacts,
    sagemaker_session=sess,
    role=role,
)

create_inputs = CreateModelInput(
    instance_type=deploy_instance_type,
)

create_step = CreateModelStep(
    name=model_name,
    model=model,
    inputs=create_inputs,
)
# Transform Step for batch transform
batch_env = {
    # Configures whether to enable record batching.
    'SAGEMAKER_TFS_ENABLE_BATCHING': 'true',
    # Name of the model - this is important in multi-model deployments
    'SAGEMAKER_TFS_DEFAULT_MODEL_NAME': 'saved_model',
    # Configures how long to wait for a full batch, in microseconds.
    'SAGEMAKER_TFS_BATCH_TIMEOUT_MICROS': '50000',  # 50000 us = 50 ms
    # Corresponds to "max_batch_size" in TensorFlow Serving.
    'SAGEMAKER_TFS_MAX_BATCH_SIZE': '10000',
    # Number of seconds for the SageMaker web server timeout
    'SAGEMAKER_MODEL_SERVER_TIMEOUT': '7200',  # Seconds
    # Configures number of batches that can be enqueued.
    'SAGEMAKER_TFS_MAX_ENQUEUED_BATCHES': '10000'
}
batch_transformer = model.transformer(
    instance_type=deploy_instance_type.default_value,
    instance_count=deploy_instance_count.default_value,
    output_path=f"{raw_input_data_s3_uri}output/",
    strategy='MultiRecord',
    env=batch_env,
    assemble_with='Line',
    accept='text/csv',
    max_concurrent_transforms=1,
    max_payload=1,  # This is in Megabytes (not number of records)
)
transform_inputs = TransformInput(
    data=raw_input_data_s3_uri,
    data_type='S3Prefix',
    content_type='application/json',
    split_type='Line',
    compression_type='Gzip',
)

transform_step = TransformStep(
    name=create_step.name,
    transformer=batch_transformer,
    cache_config=cache_config,
    inputs=transform_inputs,
)
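(For completeness, a minimal sketch of how these steps might be assembled into the pipeline; the pipeline name and the exact parameter and step lists are assumptions, and the original likely includes more of both:)

from sagemaker.workflow.pipeline import Pipeline

# Hypothetical pipeline assembly; the name, parameters, and step list
# are assumed from the snippets above, not the original definition.
pipeline = Pipeline(
    name="tf-batch-transform-pipeline",
    parameters=[deploy_instance_type, deploy_instance_count],
    steps=[processing_step, training_step, create_step, transform_step],
    sagemaker_session=sess,
)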
...
And this is the error that I get when it tries to run the model.transformer line:
sgmkr_1 | Object of type 'Properties' is not JSON serializable: TypeError
sgmkr_1 | Traceback (most recent call last):
sgmkr_1 | File "/var/task/lambda_function.py", line 53, in lambda_handler
sgmkr_1 | model_package_group_name=model_package_group_name
sgmkr_1 | File "/var/task/pipeline_definition_template.py", line 722, in get_pipeline
sgmkr_1 | max_payload=1, # This is in Megabytes (not number of records)
sgmkr_1 | File "/var/lang/lib/python3.6/site-packages/sagemaker/model.py", line 842, in transformer
sgmkr_1 | self._create_sagemaker_model(instance_type, tags=tags)
sgmkr_1 | File "/var/lang/lib/python3.6/site-packages/sagemaker/model.py", line 331, in _create_sagemaker_model
sgmkr_1 | tags=tags,
sgmkr_1 | File "/var/lang/lib/python3.6/site-packages/sagemaker/session.py", line 2530, in create_model
sgmkr_1 | LOGGER.debug("CreateModel request: %s", json.dumps(create_model_request, indent=4))
sgmkr_1 | File "/var/lang/lib/python3.6/json/__init__.py", line 238, in dumps
sgmkr_1 | **kw).encode(obj)
sgmkr_1 | File "/var/lang/lib/python3.6/json/encoder.py", line 201, in encode
sgmkr_1 | chunks = list(chunks)
sgmkr_1 | File "/var/lang/lib/python3.6/json/encoder.py", line 430, in _iterencode
sgmkr_1 | yield from _iterencode_dict(o, _current_indent_level)
sgmkr_1 | File "/var/lang/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
sgmkr_1 | yield from chunks
sgmkr_1 | File "/var/lang/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
sgmkr_1 | yield from chunks
sgmkr_1 | File "/var/lang/lib/python3.6/json/encoder.py", line 437, in _iterencode
sgmkr_1 | o = _default(o)
sgmkr_1 | File "/var/lang/lib/python3.6/json/encoder.py", line 180, in default
sgmkr_1 | o.__class__.__name__)
sgmkr_1 | TypeError: Object of type 'Properties' is not JSON serializable
It would appear that it's trying to log the definition of the CreateModel request, which it can't do because the request still contains a class object (an unresolved Properties placeholder) in its definition. Since the logging is not likely crucial to the overall code, I wonder if this is just a bug or if there is something I can do to compensate for it.
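(The failure mode itself is easy to reproduce outside SageMaker: json.dumps raises this TypeError for any object its encoder doesn't know how to serialize. The stand-in class below is illustrative only, not the real sagemaker.workflow.properties.Properties.)

import json

class Properties:
    """Stand-in for the SDK's unresolved pipeline property placeholder."""
    pass

# json.dumps has no encoder for arbitrary objects, so logging a request
# dict that still contains a placeholder raises TypeError, as in the
# traceback above.
json.dumps({"PrimaryContainer": {"ModelDataUrl": Properties()}})
# TypeError: Object of type 'Properties' is not JSON serializable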
The problem was the way I set up the transformer. I assumed I needed to create the dependency by calling the transformer method on the model itself, when I just needed to reference the name of the model created in the create_step when constructing the Transformer. So instead of this:
batch_transformer = model.transformer(
    instance_type=deploy_instance_type.default_value,
    instance_count=deploy_instance_count.default_value,
    output_path=f"{raw_input_data_s3_uri}output/",
    strategy='MultiRecord',
    env=batch_env,
    assemble_with='Line',
    accept='text/csv',
    max_concurrent_transforms=1,
    max_payload=1,  # This is in Megabytes (not number of records)
)
I needed this:
from sagemaker.transformer import Transformer

batch_transformer = Transformer(
    model_name=create_step.properties.ModelName,
    instance_type=deploy_instance_type.default_value,
    instance_count=deploy_instance_count.default_value,
    output_path=f"{raw_input_data_s3_uri}output/",
    strategy='MultiRecord',
    env=batch_env,
    assemble_with='Line',
    accept='text/csv',
    max_concurrent_transforms=1,
    max_payload=1,  # This is in Megabytes (not number of records)
)
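This works because create_step.properties.ModelName is not a string at definition time; it is a placeholder that the pipeline serializes into a step reference (roughly {"Get": "Steps.<step name>.ModelName"}) and that SageMaker resolves to the real model name at execution time, which also gives the TransformStep its dependency on the CreateModelStep. A quick way to see this (the exact repr may vary by SDK version):

# The property is a placeholder object, not a resolved string:
print(type(create_step.properties.ModelName))
# e.g. <class 'sagemaker.workflow.properties.Properties'>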