简体   繁体   中英

JSONata, how to transform array with elements missing?

I am trying to convert an 2D array, let's say rows, into an array of the same rows, but for some rows one attribute is missing (omitted) and needs to be taken from one of the previous rows for which the attribute was filled.

Unfortunately, I cannot find a way to access the previous rows or to store the attribute in a variable that is persistent.

I hope somebody can give me a hint, how this is best achieved.

Let's assume the data looks like this:

{
  "payload": [
    {
      "Name": "Name1",
      "Values": "MsgId1.3 / Type1 / "
    },
    {
      "Values": "MsgId1.1 / Type3 / COMP"
    },
    {
      "Name": "Name2",
      "Values": "MsgId2.6 / Type1 / COMP"
    },
    {
      "Values": "MsgId2.5 / Type4 / COMP"
    },
    {
      "Values": "MsgId2.4 / Type4 / REJT"
    },
    {
      "Name": "Name3",
      "Values": "MsgId3.2 / Type7 / "
    }
  ]
}

The expected result looks like this:

{
  "list": [
    {
      "NAME": "Name1",
      "MSG_ID": "MsgId1.3",
      "MSG_TYPE": "Type1",
      "MSG_STATUS": ""
    },
    {
      "NAME": "Name1",   /* <-- this line is missing in my results */
      "MSG_ID": "MsgId1.1",
      "MSG_TYPE": "Type3",
      "MSG_STATUS": "COMP"
    },
    {
      "NAME": "Name2",
      "MSG_ID": "MsgId2.6",
      "MSG_TYPE": "Type1",
      "MSG_STATUS": "COMP"
    },
    {
      "NAME": "Name2",   /* <-- this line is missing in my results */
      "MSG_ID": "MsgId2.5",
      "MSG_TYPE": "Type4",
      "MSG_STATUS": "COMP"
    },
    {
      "NAME": "Name2",   /* <-- this line is missing in my results */
      "MSG_ID": "MsgId2.4",
      "MSG_TYPE": "Type4",
      "MSG_STATUS": "REJT"
    },
    {
      "NAME": "Name3",
      "MSG_ID": "MsgId3.2",
      "MSG_TYPE": "Type7",
      "MSG_STATUS": ""
    }
  ]
}

My last JSONata is like this , but it does not work.

{
  "list": [
    $.payload
      .(
        $values := $.Values ~> $split(" / ");
        $name := ( ($.Name) ? ($.Name) : $name );
        {
          "NAME": $name,
          "MSG_ID": $values[0],
          "MSG_TYPE": $values[1],
          "MSG_STATUS": $values[2]
        }
      )
  ]
}

I also tried with $each() function, but to no avail.

I haven't measured the performance of this solution, but one idea would be to accumulate the last used name in the accumulator of the $reduce function to fill in the blanks when populating the list: https://stedi.link/Uhi76Yq

Similar idea, but based on the $index argument passed down to the callback argument of the $reduce function: https://stedi.link/952AyV9 (maybe that one would be faster to run).

I found a very inefficient solution that seems to work by using the $filter() function.

EDIT: Unfortunately, the runtime for everything with more than a few rows (I will have n-thousand) is minutes. So far it gives me the correct result, but that way I cannot use it. Keeping it, because it is at least a solution.

$names:= ([$.payload#$i.({ "pos": $i, "name": $.Name })[name]])^(<pos); filters the original data for all rows that have a name and stores the index = "pos" = $i with it.

$prevname:= $filter($names, function($v, $j, $a) { $v.pos <= $i })[-1].name; filters the array $names for all rows with index <= current index ( $v.pos <= $i ) and only returns the last row's .name .

(
  $names := ([$.payload#$i.({ "pos": $i, "name": $.Name })[name]])^(<pos);
  {
    "list": [
      $
        .payload#$i
        .(
          $values := $.Values ~> $split(" / ");
          $prevname := $filter($names, function($v, $j, $a) { $v.pos <= $i })[-1].name;
          {
            "NAME": ($.Name) ? ($.Name) : $prevname,
            "MSG_ID": $values[0],
            "MSG_TYPE": $values[1],
            "MSG_STATUS": $values[2]
          }
        )
    ]
  }
)

Example with solution can be found here .

Rather than use the mapping operator ( . ) which has no defined order of execution, you need to write a recursive function to step through each row imperatively. You can then pass in the $name from the previous row for use as the default name for the next row. The following expression does this:

(
    $row := function($name, $payload) {(
        $first := $payload[0];
        $rest := $payload[[1..$count($payload)-1]];
        $values := $first.Values ~> $split(" / ");
        $name := ( ($first.Name) ? ($first.Name) : $name );
        [
            {
            "NAME": $name,
            "MSG_ID": $values[0],
            "MSG_TYPE": $values[1],
            "MSG_STATUS": $values[2]
            },
            $rest ? $row($name, $rest)
        ]
    )};
    {
        "list": $row("", payload)
    }
)

See https://try.jsonata.org/g7bcZKHKl

If your dataset is large enough, then this will eventually exceed the stack limit. If this happens, you need to rewrite the function slightly to be tail recursive. See https://docs.jsonata.org/programming#tail-call-optimization-tail-recursion

Andrew Coleman's answer put me in the right direction.

My final solution can deal with large data sets. It runs quick even with thousands of objects in the payload array.

Example with solution can be found here .

(
  $previous := function($index) {
    ($index > 0) ?
      ($.payload[$index - 1].Name ?
        $.payload[$index - 1].Name :
        $previous($index - 1)) : 
      "" };

  {
    "list": $
      .payload#$i
      .(
        $values := $.Values ~> $split(" / ");
        {
          "pos": (0) ? $i,
          "NAME": ($.Name) ? ($.Name) : $previous($i),
          "MSG_ID": $values[0],
          "MSG_TYPE": $values[1],
          "MSG_STATUS": $values[2]
        }
      )
  }
)

.payload#$i uses Positional Variable Binding ( #$i ) to give me the index of the row of the original array payload .

"NAME": ($.Name)? ($.Name): $previous($i), "NAME": ($.Name)? ($.Name): $previous($i), checks if the Name is filled in the current row, if not a previous Name is searched via function $previous .

Function $previous checks if the Name of the previous row is filled and returns it. Otherwise it calls itself recursively until it finds the row above with Name filled.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM