AWS SageMaker: Can I pass a sagemaker.workflow.parameters.ParameterString to an SKLearnProcessor?

I am creating a SageMaker pipeline. In the evaluation step, I would like to pass an argument to my preprocess.py script.

There are a few examples online of how to do so (a sample below) but they all use static values. I want to pass a Workflow parameter (string in this case) to the script.

I tried multiple approaches without success, and I opened a GitHub issue but have received no response so far.

The linked GitHub issue details every approach I have tried, but the root problem is that a workflow parameter is only evaluated at runtime, so its value is not available while the pipeline is being defined.

I would like to know if what I want to do is possible or not.

Option 1: Typical approach: passing static values

sklearn_processor.run(
    code="preprocess.py",
    inputs = [
        ProcessingInput(source = 'my_package/', destination = '/opt/ml/processing/input/code/my_package/')
    ],
    outputs=[
        ProcessingOutput(output_name="test_transform_data", 
                         source = '/opt/ml/processing/output/test_transform',
                         destination = out_path),
    ],
    arguments=["--time-slot-minutes", "30min"]
)

source for the sample code: How to pass region to the SKLearnProcessor - botocore.exceptions.NoRegionError: You must specify a region

Option 2: My approach: passing a workflow parameter

step_args = myprocessor.run(
    inputs=[
        ProcessingInput(source=s3_full_address, destination="/opt/ml/processing/input"),
    ],
    outputs=[
        ProcessingOutput(output_name="raw", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
    ],
    code="generate_train_test_data.py",
    arguments=["--s3_prefix", s3_prefix]
)

Here s3_prefix is a workflow parameter defined as s3_prefix = ParameterString(name="InputPrefix", default_value="myprefix")

To pass a workflow parameter to your script, use the job_arguments option of ProcessingStep.

1. Step definition

Update your step definition to add the argument job_arguments

ProcessingStep(
    name="step-name",
    processor=my_processor,
    # my_argument is a workflow parameter (for example a ParameterString)
    job_arguments=[
        "--my_argument", my_argument
    ],
    ...
    code="myscript.py"
)
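As a rough mental model of why the value cannot be read as a plain string while the pipeline is being defined, here is a toy sketch. ToyParameter, build_definition and resolve are hypothetical names made up for illustration, not SageMaker classes or functions, and the {"Get": "Parameters.<name>"} placeholder shape is my assumption about how the definition refers to a parameter. The point is only that the definition stores a reference and the concrete string appears when an execution starts.

```python
class ToyParameter:
    """Hypothetical stand-in for a pipeline parameter (not the SageMaker class)."""

    def __init__(self, name, default_value):
        self.name = name
        self.default_value = default_value

    def to_placeholder(self):
        # Assumed placeholder shape: a reference by name, not a value
        return {"Get": f"Parameters.{self.name}"}


def build_definition(arguments):
    """Definition time: literals are kept, parameters become placeholders."""
    return [
        a.to_placeholder() if isinstance(a, ToyParameter) else a
        for a in arguments
    ]


def resolve(definition, defaults, overrides=None):
    """Execution time: each placeholder is replaced by its concrete value."""
    values = {**defaults, **(overrides or {})}
    resolved = []
    for item in definition:
        if isinstance(item, dict) and "Get" in item:
            name = item["Get"].split(".", 1)[1]
            resolved.append(values[name])
        else:
            resolved.append(item)
    return resolved


s3_prefix = ToyParameter("InputPrefix", "myprefix")
definition = build_definition(["--s3_prefix", s3_prefix])
defaults = {s3_prefix.name: s3_prefix.default_value}

# Default value is used unless the execution overrides it
default_run = resolve(definition, defaults)
custom_run = resolve(definition, defaults, {"InputPrefix": "other/prefix"})
```

In this toy, anything that tries to treat the parameter as a string at definition time only ever sees the placeholder, which matches the behaviour described in the question.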

2. Reading the argument

In your script (myscript.py in this example), read the argument as follows:

import argparse


def parse_args():
    parser = argparse.ArgumentParser()

    # values sent by the client are passed as command-line arguments to the script
    parser.add_argument('--my_argument', type=str)

    # parse_known_args returns (namespace, list of unrecognized arguments)
    return parser.parse_known_args()


args, _ = parse_args()
my_argument = args.my_argument
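If you want to check the parsing logic locally before running the pipeline, a minimal sketch could look like the following. It assumes the resolved parameter value reaches the script as an ordinary command-line string; the optional argv parameter and the extra --other-flag argument are additions for this example, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse known arguments; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--my_argument", type=str)
    # Unrecognized arguments are returned in a list instead of raising an error
    return parser.parse_known_args(argv)


# Simulate the command line the job might pass to the script
args, unknown = parse_args(["--my_argument", "myprefix", "--other-flag", "1"])
my_argument = args.my_argument
```

Passing an explicit list keeps the check independent of sys.argv, so the same function works both in a local test and inside the processing job.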
