How to share data in AWS Step Functions without passing it between steps

Question

I use AWS Step Functions and have the following workflow:

initStep - a Lambda function handler that gets some data and sends it to SQS for an external service.

import json
import os
from datetime import datetime

import boto3

activity = os.getenv('ACTIVITY')
queue_name = os.getenv('QUEUE_NAME')

def lambda_handler(event, context):
  event['my_activity'] = activity
  data = json.dumps(event)

  # Retrieve the queue by its name
  sqs = boto3.resource('sqs')
  queue = sqs.get_queue_by_name(QueueName=queue_name)

  # MessageGroupId applies to FIFO queues; the time suffix varies it per message
  queue.send_message(MessageBody=data, MessageGroupId='messageGroup1' + str(datetime.now().time()))

  return event

validationWaiting - an activity that waits for an answer from the external service; the answer includes the data.

complete - a Lambda function handler that uses the data from initStep.

import os

import boto3

# Assumption: the sender address comes from an environment variable;
# the original snippet used from_addresses without defining it
from_addresses = os.getenv('FROM_ADDRESS')

def lambda_handler(event, context):
  email = event.get('email')
  data = event.get('data')

  client = boto3.client(service_name='ses')
  # Guard against a missing email so split() is not called on None
  to = email.split(', ') if email else []
  message_container = {'Subject': {'Data': 'Email from step functions'},
           'Body': {'Html': {
               'Charset': "UTF-8",
               'Data': """<html><body>
                            <p>""" + str(data) + """</p>
                            </body> </html> """
           }}}

  destination = {'ToAddresses': to,
               'CcAddresses': [],
               'BccAddresses': []}

  return client.send_email(Source=from_addresses,
                         Destination=destination,
                         Message=message_container)

It does work, but the problem is that I'm sending the full data from initStep to the external service just so that it can be passed on to complete later. More steps may be added in the future.

I believe it would be better to share it as some sort of global data (scoped to the current step function execution). That way I could add or remove steps and the data would still be available to all of them.

Answer 1

You can make use of InputPath and ResultPath. In initStep you would send only the necessary data to the external service (probably along with some unique identifier of the Execution). In the ValidationWaiting step you can set the following properties (in the State Machine definition):

InputPath: What data will be provided to GetActivityTask. You probably want to set it to something like $.execution_unique_id, where execution_unique_id is a field in your data that the external service uses to identify the Execution (to match it with the specific request made during initStep).
ResultPath: Where the output of the ValidationWaiting Activity will be saved in the data. You can set it to $.validation_output and the JSON result from the external service will be present there.

This way you can send the external service only the data it actually needs, and you won't lose access to any data that was in the input before the ValidationWaiting step.
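As a rough sketch of the first half of this idea, initStep could build a trimmed SQS payload instead of serializing the whole event. The field names execution_unique_id and validation_payload, the helper build_external_message, and the injectable send parameter are all assumptions made for illustration; they are not part of the original workflow.

```python
import json

# Hypothetical field names; adjust to whatever the external service needs.
FIELDS_FOR_EXTERNAL_SERVICE = ("execution_unique_id", "validation_payload")


def build_external_message(event):
    """Return a JSON string with only the fields the external service needs."""
    trimmed = {key: event[key] for key in FIELDS_FOR_EXTERNAL_SERVICE if key in event}
    return json.dumps(trimmed, sort_keys=True)


def lambda_handler(event, context, send=None):
    # `send` is an injectable callable (an assumption, added for testability);
    # in a real Lambda it could wrap queue.send_message(MessageBody=...).
    message = build_external_message(event)
    if send is not None:
        send(message)
    # Returning the full event keeps every field in the state machine's own state.
    return event
```

The full event still flows on to later states through the return value, while the queue only ever sees the trimmed message.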

For example, you could have the following definition of the State Machine:

{
  "StartAt": "initStep",
  "States": {
    "initStep": {
      "Type": "Pass",
      "Result": {
        "executionId": "some:special:id",
        "data": {},
        "someOtherData": {"value": "key"}
      },
      "Next": "ValidationWaiting"
    },
    "ValidationWaiting": {
      "Type": "Pass",
      "InputPath": "$.executionId",
      "ResultPath": "$.validationOutput",
      "Result": {
        "validationMessages": ["a", "b"]
      },
      "Next": "Complete"
    },
    "Complete": {
      "Type": "Pass",
      "End": true
    }
  }
}

I've used Pass states for initStep and ValidationWaiting to simplify the example (I haven't run it, but it should work). The Result field is specific to the Pass state and is equivalent to the result of your Lambda functions or Activity.

In this scenario the Complete step would get the following input:

{
  "executionId": "some:special:id",
  "data": {},
  "someOtherData": {"value": "key"},
  "validationOutput": {
    "validationMessages": ["a", "b"]
  }
}

So the result of the ValidationWaiting step has been saved into the validationOutput field.
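To make the merge behaviour concrete, here is a simplified local model of what ResultPath does with a state's result. It is only a sketch: the function apply_result_path is hypothetical, handles just "$" and dotted field paths such as "$.validationOutput", and is not how Step Functions itself is implemented.

```python
import copy


def apply_result_path(state_input, result_path, result):
    """Mimic ResultPath for simple paths: place `result` inside a copy of the input."""
    if result_path == "$":
        return result  # "$" replaces the whole input with the result
    if not result_path.startswith("$."):
        raise ValueError("only simple '$.field' paths are supported in this sketch")
    output = copy.deepcopy(state_input)  # leave the caller's input untouched
    *parents, leaf = result_path[2:].split(".")
    node = output
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = result
    return output


state_input = {
    "executionId": "some:special:id",
    "data": {},
    "someOtherData": {"value": "key"},
}
merged = apply_result_path(
    state_input, "$.validationOutput", {"validationMessages": ["a", "b"]}
)
```

Here merged contains the three original fields plus validationOutput, which matches the input shown for the Complete step.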

Answer 2

Based on the answer of Marcin Sucharski, I came up with my own solution.

I needed to use Type: Task since initStep is a Lambda that sends a message to SQS.

I didn't need InputPath in ValidationWaiting, only ResultPath, which stores the data received from the activity.

I work with the Serverless framework; here is my final solution:

StartAt: initStep
States: 
  initStep:
    Type: Task
    Resource: arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:init-step
    Next: ValidationWaiting
  ValidationWaiting:
    Type: Task
    ResultPath: $.validationOutput
    Resource: arn:aws:states:#{AWS::Region}:#{AWS::AccountId}:activity:validationActivity
    Next: Complete
    Catch:
      - ErrorEquals:
          - States.ALL
        ResultPath: $.validationOutput
        Next: Complete
  Complete:
    Type: Task
    Resource: arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:complete-step
    End: true
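For context, the external service side of such an activity could look roughly like the sketch below. It relies on the boto3 Step Functions client calls get_activity_task, send_task_success and send_task_failure; the function poll_once, the validate callback, the worker name and the "ValidationError" error name are assumptions for illustration and do not come from the answer.

```python
import json


def poll_once(sfn_client, activity_arn, validate, worker_name="validation-worker"):
    """Fetch one activity task, validate its input, and report the outcome.

    Returns True if a task was handled, False if the long poll returned no task.
    """
    task = sfn_client.get_activity_task(activityArn=activity_arn, workerName=worker_name)
    token = task.get("taskToken")
    if not token:
        return False  # the long poll ended without any work
    try:
        result = validate(json.loads(task["input"]))
        # The JSON sent here is what ResultPath stores under $.validationOutput.
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    except Exception as exc:
        # A reported failure is what the Catch clause in the definition handles.
        sfn_client.send_task_failure(taskToken=token, error="ValidationError", cause=str(exc))
    return True


# Possible usage with a real client (not executed here):
#   import boto3
#   poll_once(boto3.client("stepfunctions"), "<activity ARN>", my_validate_function)
```

Passing the client in as a parameter keeps the function easy to exercise with a stub instead of a live AWS account.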

Answer 3

Here is a short and simple solution with InputPath and ResultPath. My Lambda Check_Ubuntu_Updates returns a list of instances ready to be updated. The Notify_Results step receives this list and uses it. Remember that if several states in your Step Function set a ResultPath and a step needs more than one of those values as input, the only InputPath that gives it access to all of them is $.

{
  "Comment": "A state machine that checks for available system updates.",
  "StartAt": "Check_Ubuntu_Updates",
  "States": {
    "Check_Ubuntu_Updates": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:#############:function:Check_Ubuntu_Updates",
      "ResultPath": "$.instances",
      "Next": "Notify_Results"
    },
    "Notify_Results": {
      "Type": "Task",
      "InputPath": "$.instances",
      "Resource": "arn:aws:lambda:us-east-1:#############:function:Notify_Results",
      "End": true
    }
  }
}
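A minimal sketch of what the Notify_Results handler might look like under this definition, assuming Check_Ubuntu_Updates returns a JSON list of instance IDs (the answer does not show either function, so the return shape and the report format here are hypothetical):

```python
def lambda_handler(event, context):
    # With "InputPath": "$.instances", `event` is the list itself,
    # not a dict that has an "instances" key.
    instances = event if isinstance(event, list) else event.get("instances", [])
    lines = ["- " + str(instance) for instance in instances]
    return {"count": len(instances), "report": "\n".join(lines)}
```

The fallback branch covers the case where InputPath is set to $ and the handler receives the whole state instead of just the list.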

3 answers

Answer 1: score 4, 2019-02-24 12:50:03
Answer 2: score 3, accepted, 2019-02-27 21:19:36
Answer 3: score 1, 2019-02-26 17:55:04
