简体   繁体   中英

How to capture output of a parallel state machine in AWS Step function

I need to run an AWS Step function that runs a parallel state machine running, say two state machines. My requirement is to check the final execution status of the parallel machine and if there is any failure, invoke an SNS service to send out an email. Pretty standard stuff but for the life of me, i can't figure out how to capture the combined error of a parallel step machine. This sample parallel machine runs

  1. A "passtask" that is just a simple lambda pass function, and
  2. Runs a failtask that has a sleep timer for 5 seconds and is suppposed to fail after 5 seconds.

If I execute this machine, this machine correctly shows passtask as succeeded, failtask as cancelled, Overall Parallel Task as succeeded (?????), Notify Failure task as cancelled and the overall execution of state machine as "failed" as well.

I'd like to see passtask as succeeded, fail task as failed, overall Parallel Task as Failed, Notify Failure task as succeeded.

{
  "Comment": "Parallel Example",
  "StartAt": "Parallel Task",
  "TimeoutSeconds": 120,
  "States": {
    "Parallel Task": {
      "Type": "Parallel",
      "Branches": [
       {
         "StartAt": "passtask",
         "States": {
           "passtask": {
             "Type": "Task",
             "Resource":"arn:xxxxxxxxxxxxxxx:function:passfunction",
             "End": true
           }
         }
       },
       {
         "StartAt": "failtask",
         "States": {
           "failtask": {
             "Type": "Task",
             "Resource":"arn: xxxxxxxxxxxxxxx:function:failfunction",
             "End": true
           }
         }
       }
      ],
      "ResultPath": "$.status",
      "Catch": [
        {
          "ErrorEquals": ["States.ALL"],
          "Next": "Notify Failure"
        }
      ],
      "Next": "Notify Success"
    },
    "Notify Failure": {
      "Type": "Pass",
      "InputPath": "$.input.Cause",
      "End": true
    },
    "Notify Success": {
      "Type": "Pass",
      "Result": "This is a fallback from a task success",
      "End": true
    }
  }
}

From your requirment "My requirement is to check the final execution status of the parallel machine and if there is any failure, invoke an SNS service to send out an email.", I understand that the "failtask" is just for debugging purposes and in the future it won't neccesarily fail. So the problem is, the moment Step Functions detect a failure in a branch all other branches are terminated and their outputs discarded, only the failed branch's output is used. So if you want to preserve the output of each Branch and check if a failure has occured, you will need to handle the errors in each branch and not report the whole branch as failed. Additionally you will need to add an output field to each branch which says if there was a failure or not (Choice State will give an error if a field does not exist). And also remember that the output of a ParralelState is an array with the output of each Branch, for example this State Machine should let each branch finish execution and handle the errors correctly:

{
    "Comment": "Parallel Example",
    "StartAt": "Parallel Task",
    "TimeoutSeconds": 120,
    "States": {
        "Parallel Task": {
            "Type": "Parallel",
            "Branches": [{
                    "StartAt": "passtask",
                    "States": {
                        "passtask": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:us-east-1:XXXXXXXXXXXXXXXXX",
                            "Next": "SuccessBranch1",
                            "Catch": [{
                                "ErrorEquals": ["States.ALL"],
                                "Next": "FailBranch1"
                            }]
                        },
                        "SuccessBranch1": {
                            "Type": "Pass",
                            "Result": {
                                "Error": false
                            },
                            "ResultPath": "$.Status",
                            "End": true
                        },
                        "FailBranch1": {
                            "Type": "Pass",
                            "Result": {
                                "Error": true
                            },
                            "ResultPath": "$.Status",
                            "End": true
                        }
                    }

                },
                {
                    "StartAt": "failtask",
                    "States": {
                        "failtask": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:us-east-1:XXXXXXXXXXXXXXXXX",
                            "Next": "SuccessBranch2",
                            "Catch": [{
                                "ErrorEquals": ["States.ALL"],
                                "Next": "FailBranch2"
                            }]
                        },
                        "SuccessBranch2": {
                            "Type": "Pass",
                            "Result": {
                                "Error": false
                            },
                            "ResultPath": "$.Status",
                            "End": true
                        },
                        "FailBranch2": {
                            "Type": "Pass",
                            "Result": {
                                "Error": true
                            },
                            "ResultPath": "$.Status",
                            "End": true
                        }
                    }
                }
            ],
            "ResultPath": "$.ParralelOutput",
            "Catch": [{
                "Comment": "This catch should never catch any errors, as the error handling is done in the individual Branches",
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.ParralelOutput",
                "Next": "ChoiceStateX"
            }],
            "Next": "ChoiceStateX"
        },

        "ChoiceStateX": {
            "Type": "Choice",

            "Choices": [{
                "Or": [{
                        "Variable": "$.ParralelOutput[0].Status.Error",
                        "BooleanEquals": true
                    },
                    {
                        "Variable": "$.ParralelOutput[1].Status.Error",
                        "BooleanEquals": true
                    }
                ],
                "Next": "Notify Failure"
            }],
            "Default": "Notify Success"
        },

        "Notify Failure": {
            "Type": "Pass",
            "End": true
        },
        "Notify Success": {
            "Type": "Pass",
            "Result": "This is a fallback from a task success",
            "End": true
        }
    }
}

For a more general case (although more complex) of the above as asked by Nisman in the comments. Instead of hardcoding the Choice State to check for every branch we can add a pass state with some JSONPath tricks to check for conditions not currently possible with a choice state alone.

Inside this Pass State we use Parameters to restructure our data in such a way that when we apply a JSONPath filter expression to this data using the OutputPath we are left with an array of either 2 (if no branches failed) or 3 (if some branches failed) elements, where the first element always contains the original input data and the second/third contains at least 1 key with the same name to be used by the choice state. Here's the State Machine JSON:

{
  "Comment": "Parallel Example",
  "StartAt": "Parallel Task",
  "States": {
    "Parallel Task": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "passtask",
          "States": {
            "passtask": {
              "Type": "Task",
              "Resource": "<TASK RESOURCE>",
              "End": true,
              "Catch": [
                {
                  "ErrorEquals": [
                    "States.ALL"
                  ],
                  "ResultPath": "$.error-info",
                  "Next": "FailBranch1"
                }
              ]
            },
            "FailBranch1": {
              "Type": "Pass",
              "Parameters": {
                "BranchOutput.$": "$",
                "BranchError": true
              },
              "End": true
            }
          }
        },
        {
          "StartAt": "failtask",
          "States": {
            "failtask": {
              "Type": "Task",
              "Resource": "<TASK RESOURCE>",
              "End": true,
              "Catch": [
                {
                  "ErrorEquals": [
                    "States.ALL"
                  ],
                  "ResultPath": "$.error-info",
                  "Next": "FailBranch2"
                }
              ]
            },
            "FailBranch2": {
              "Type": "Pass",
              "Parameters": {
                "BranchOutput.$": "$",
                "BranchError": true
              },
              "End": true
            }
          }
        }
      ],
      "ResultPath": "$.ParralelOutput",
      "Next": "Pre-Process"
    },
    "Pre-Process": {
      "Type": "Pass",
      "Parameters": {
        "OrderedArray": [
          {
            "OriginalData": {
              "Input.$": "$",
              "ShouldFilterData": false
            }
          },
          {
            "ValuesToCheck": {
              "ListBranchErrors.$": "$.ParralelOutput[?(@.BranchError==true)].BranchError",
              "BranchFailures": true
            }
          },
          {
            "DefaultAlwaysFalse": {
              "ShouldFilterData": false,
              "BranchFailures": false
            }
          }
        ]
      },
      "OutputPath": "$..[?(@.ShouldFilterData == false || @.ListBranchErrors[0] == true)]",
      "Next": "ChoiceStateX"
    },
    "ChoiceStateX": {
      "Type": "Choice",
      "OutputPath": "$.[0].Input",
      "Choices": [
        {
          "Variable": "$[1].BranchFailures",
          "BooleanEquals": true,
          "Next": "NotifyFailure"
        },
        {
          "Variable": "$[1].BranchFailures",
          "BooleanEquals": false,
          "Next": "NotifySuccess"
        }
      ],
      "Default": "NotifyFailure"
    },
    "NotifyFailure": {
      "Type": "Pass",
      "End": true
    },
    "NotifySuccess": {
      "Type": "Pass",
      "Result": "This is a fallback from a task success",
      "End": true
    }
  }
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM