Tensorflow Model Transformer gives error on model_data from previous training step property in AWS Sagemaker Workflow Pipeline
I am trying to set up an AWS SageMaker Pipeline that trains a TensorFlow model and then, once it passes the appropriate acceptance criteria, runs a batch transform step. This is all based on the tutorial provided by the Data Science On AWS group, although I have heavily modified the code (this step is not in their original code).
Here is the relevant portion of the code:
...
training_step = TrainingStep(
    name="Train",
    estimator=estimator,
    inputs={
        "train": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "validation": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["validation"].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "test": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
            content_type="text/csv",
        ),
    },
    cache_config=cache_config,
)
...
inference_image_uri = sagemaker.image_uris.retrieve(
    framework="tensorflow",  # todo: edit
    region=region,
    version="2.3.1",
    py_version="py37",
    instance_type=deploy_instance_type,
    image_scope="inference",
)
print('Inference image uri: ', inference_image_uri)
...
# Create Model for Deployment Step
model = Model(
    name=model_name,
    image_uri=inference_image_uri,
    model_data=training_step.properties.ModelArtifacts.S3ModelArtifacts,
    sagemaker_session=sess,
    role=role,
)

create_inputs = CreateModelInput(
    instance_type=deploy_instance_type,
)

create_step = CreateModelStep(
    name=model_name,
    model=model,
    inputs=create_inputs,
)
# Transform Step for batch transform
batch_env = {
    # Configures whether to enable record batching.
    'SAGEMAKER_TFS_ENABLE_BATCHING': 'true',
    # Name of the model - this is important in multi-model deployments
    'SAGEMAKER_TFS_DEFAULT_MODEL_NAME': 'saved_model',
    # Configures how long to wait for a full batch, in microseconds.
    'SAGEMAKER_TFS_BATCH_TIMEOUT_MICROS': '50000',  # microseconds
    # Corresponds to "max_batch_size" in TensorFlow Serving.
    'SAGEMAKER_TFS_MAX_BATCH_SIZE': '10000',
    # Number of seconds for the SageMaker web server timeout
    'SAGEMAKER_MODEL_SERVER_TIMEOUT': '7200',  # Seconds
    # Configures number of batches that can be enqueued.
    'SAGEMAKER_TFS_MAX_ENQUEUED_BATCHES': '10000',
}

batch_transformer = model.transformer(
    instance_type=deploy_instance_type.default_value,
    instance_count=deploy_instance_count.default_value,
    output_path=f"{raw_input_data_s3_uri}output/",
    strategy='MultiRecord',
    env=batch_env,
    assemble_with='Line',
    accept='text/csv',
    max_concurrent_transforms=1,
    max_payload=1,  # This is in Megabytes (not number of records)
)

transform_inputs = TransformInput(
    data=raw_input_data_s3_uri,
    data_type='S3Prefix',
    content_type='application/json',
    split_type='Line',
    compression_type='Gzip',
)

transform_step = TransformStep(
    name=create_step.name,
    transformer=batch_transformer,
    cache_config=cache_config,
    inputs=transform_inputs,
)
...
Here is the error I get when it tries to run the model.transformer line:
sgmkr_1 | Object of type 'Properties' is not JSON serializable: TypeError
sgmkr_1 | Traceback (most recent call last):
sgmkr_1 |   File "/var/task/lambda_function.py", line 53, in lambda_handler
sgmkr_1 |     model_package_group_name=model_package_group_name
sgmkr_1 |   File "/var/task/pipeline_definition_template.py", line 722, in get_pipeline
sgmkr_1 |     max_payload=1, # This is in Megabytes (not number of records)
sgmkr_1 |   File "/var/lang/lib/python3.6/site-packages/sagemaker/model.py", line 842, in transformer
sgmkr_1 |     self._create_sagemaker_model(instance_type, tags=tags)
sgmkr_1 |   File "/var/lang/lib/python3.6/site-packages/sagemaker/model.py", line 331, in _create_sagemaker_model
sgmkr_1 |     tags=tags,
sgmkr_1 |   File "/var/lang/lib/python3.6/site-packages/sagemaker/session.py", line 2530, in create_model
sgmkr_1 |     LOGGER.debug("CreateModel request: %s", json.dumps(create_model_request, indent=4))
sgmkr_1 |   File "/var/lang/lib/python3.6/json/__init__.py", line 238, in dumps
sgmkr_1 |     **kw).encode(obj)
sgmkr_1 |   File "/var/lang/lib/python3.6/json/encoder.py", line 201, in encode
sgmkr_1 |     chunks = list(chunks)
sgmkr_1 |   File "/var/lang/lib/python3.6/json/encoder.py", line 430, in _iterencode
sgmkr_1 |     yield from _iterencode_dict(o, _current_indent_level)
sgmkr_1 |   File "/var/lang/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
sgmkr_1 |     yield from chunks
sgmkr_1 |   File "/var/lang/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
sgmkr_1 |     yield from chunks
sgmkr_1 |   File "/var/lang/lib/python3.6/json/encoder.py", line 437, in _iterencode
sgmkr_1 |     o = _default(o)
sgmkr_1 |   File "/var/lang/lib/python3.6/json/encoder.py", line 180, in default
sgmkr_1 |     o.__class__.__name__)
sgmkr_1 | TypeError: Object of type 'Properties' is not JSON serializable
It appears that it is trying to log the create-model request definition, which it cannot do because the definition contains a class object. Since logging is presumably not essential to the overall code, I am wondering whether this is simply a bug, or whether there is something I can do to work around it.
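The logging call is indeed where the serialization blows up, and the failure can be reproduced in plain Python without the SageMaker SDK. The Properties class below is just a stand-in for the SDK's pipeline placeholder type:

```python
import json

# Stand-in for the SageMaker SDK's pipeline placeholder type; like the
# real sagemaker.workflow.properties.Properties, plain json.dumps has
# no encoder for it.
class Properties:
    pass

# Mimics the CreateModel request dict that session.create_model tries
# to log with json.dumps(..., indent=4).
create_model_request = {
    "PrimaryContainer": {"ModelDataUrl": Properties()},
}

try:
    json.dumps(create_model_request, indent=4)
except TypeError as e:
    print(e)  # Object of type Properties is not JSON serializable
```

(Python 3.6 quotes the class name, hence the 'Properties' in the traceback above.) But as it turns out, the logging is only a symptom: the request is being built too early, which the answer below addresses.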
The problem was the way I was setting up the transformer. I had assumed I needed to create the dependency by calling the transformer method on the model itself, whereas I only needed to reference the name of the model created in create_step from the transformer. So instead of this:
batch_transformer = model.transformer(
    instance_type=deploy_instance_type.default_value,
    instance_count=deploy_instance_count.default_value,
    output_path=f"{raw_input_data_s3_uri}output/",
    strategy='MultiRecord',
    env=batch_env,
    assemble_with='Line',
    accept='text/csv',
    max_concurrent_transforms=1,
    max_payload=1,  # This is in Megabytes (not number of records)
)
I needed this:
from sagemaker.transformer import Transformer

batch_transformer = Transformer(
    model_name=create_step.properties.ModelName,
    instance_type=deploy_instance_type.default_value,
    instance_count=deploy_instance_count.default_value,
    output_path=f"{raw_input_data_s3_uri}output/",
    strategy='MultiRecord',
    env=batch_env,
    assemble_with='Line',
    accept='text/csv',
    max_concurrent_transforms=1,
    max_payload=1,  # This is in Megabytes (not number of records)
)
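The reason this works is that Transformer merely stores the ModelName reference as a placeholder that SageMaker resolves when the pipeline executes, whereas model.transformer() issues (and tries to log) a CreateModel request at pipeline-definition time and therefore needs concrete strings immediately. A toy plain-Python sketch of that eager-vs-deferred distinction follows; all the names in it are illustrative, none come from the real SageMaker SDK:

```python
# Illustrative sketch only: toy classes mimicking eager vs. deferred
# model-name resolution. Not the SageMaker SDK.

class ModelNameRef:
    """Stands in for create_step.properties.ModelName: a placeholder
    that only becomes a concrete string at pipeline execution time."""
    def resolve(self, runtime_values):
        return runtime_values["ModelName"]

def eager_transformer(model_name):
    # model.transformer() behaves like this: it builds a CreateModel
    # request right away, so a non-string placeholder fails.
    if not isinstance(model_name, str):
        raise TypeError(
            f"Object of type {type(model_name).__name__} is not JSON serializable"
        )
    return {"ModelName": model_name}

def deferred_transformer(model_name):
    # Transformer(model_name=...) just records the reference; the
    # backend substitutes the real name during execution.
    return {"ModelName": model_name}

ref = ModelNameRef()
spec = deferred_transformer(ref)  # fine at pipeline-definition time
print(spec["ModelName"].resolve({"ModelName": "my-model"}))  # prints: my-model
```

With the deferred form, nothing needs to be serialized until the pipeline actually runs, which is why the TypeError disappears.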