JSONata: how to transform an array with missing elements?

Question

I am trying to convert a 2D array (say, a list of rows) into an array of the same rows. For some rows, one attribute is missing (omitted) and needs to be taken from the most recent previous row in which the attribute was filled in.

Unfortunately, I cannot find a way to access the previous rows or to store the attribute in a variable that is persistent.

I hope somebody can give me a hint on how this is best achieved.

Let's assume the data looks like this:

{
  "payload": [
    {
      "Name": "Name1",
      "Values": "MsgId1.3 / Type1 / "
    },
    {
      "Values": "MsgId1.1 / Type3 / COMP"
    },
    {
      "Name": "Name2",
      "Values": "MsgId2.6 / Type1 / COMP"
    },
    {
      "Values": "MsgId2.5 / Type4 / COMP"
    },
    {
      "Values": "MsgId2.4 / Type4 / REJT"
    },
    {
      "Name": "Name3",
      "Values": "MsgId3.2 / Type7 / "
    }
  ]
}

The expected result looks like this:

{
  "list": [
    {
      "NAME": "Name1",
      "MSG_ID": "MsgId1.3",
      "MSG_TYPE": "Type1",
      "MSG_STATUS": ""
    },
    {
      "NAME": "Name1",   /* <-- this line is missing in my results */
      "MSG_ID": "MsgId1.1",
      "MSG_TYPE": "Type3",
      "MSG_STATUS": "COMP"
    },
    {
      "NAME": "Name2",
      "MSG_ID": "MsgId2.6",
      "MSG_TYPE": "Type1",
      "MSG_STATUS": "COMP"
    },
    {
      "NAME": "Name2",   /* <-- this line is missing in my results */
      "MSG_ID": "MsgId2.5",
      "MSG_TYPE": "Type4",
      "MSG_STATUS": "COMP"
    },
    {
      "NAME": "Name2",   /* <-- this line is missing in my results */
      "MSG_ID": "MsgId2.4",
      "MSG_TYPE": "Type4",
      "MSG_STATUS": "REJT"
    },
    {
      "NAME": "Name3",
      "MSG_ID": "MsgId3.2",
      "MSG_TYPE": "Type7",
      "MSG_STATUS": ""
    }
  ]
}

My latest JSONata expression looks like this, but it does not work:

{
  "list": [
    $.payload
      .(
        $values := $.Values ~> $split(" / ");
        $name := ( ($.Name) ? ($.Name) : $name );
        {
          "NAME": $name,
          "MSG_ID": $values[0],
          "MSG_TYPE": $values[1],
          "MSG_STATUS": $values[2]
        }
      )
  ]
}

I also tried the $each() function, but to no avail.

Answer 1

I haven't measured the performance of this solution, but one idea would be to keep the last used name in the accumulator of the $reduce function and use it to fill in the blanks when populating the list: https://stedi.link/Uhi76Yq
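The linked snippet is not reproduced in this thread, so the following is only a sketch of the accumulator idea and not necessarily the code behind the link. It assumes the payload shape from the question; the accumulator fields name and list are names chosen purely for this illustration.

```jsonata
(
  /* Fold over the rows, carrying the last seen name and the output list. */
  $filled := $reduce(payload, function($acc, $row) {(
    $values := $split($row.Values, " / ");
    /* Use the row's own Name if present, otherwise the carried one. */
    $name := $row.Name ? $row.Name : $acc.name;
    {
      "name": $name,
      "list": $append($acc.list, [{
        "NAME": $name,
        "MSG_ID": $values[0],
        "MSG_TYPE": $values[1],
        "MSG_STATUS": $values[2]
      }])
    }
  )}, { "name": "", "list": [] });

  { "list": $filled.list }
)
```

With the sample payload from the question, rows without a Name should pick up Name1 or Name2 from the nearest preceding named row. Note that $append builds a new array on every step, so the cost for very large inputs is untested here.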

Similar idea, but based on the $index argument passed to the callback of the $reduce function: https://stedi.link/952AyV9 (maybe that one would be faster to run).

Answer 2

I found a very inefficient solution that seems to work by using the $filter() function.

EDIT: Unfortunately, the runtime for anything with more than a few rows (I will have several thousand) is minutes. So far it gives me the correct result, but at that speed I cannot use it. I am keeping it because it is at least a solution.

$names := ([$.payload#$i.({ "pos": $i, "name": $.Name })[name]])^(<pos); filters the original data for all rows that have a name and stores each row's index with it as pos (the value of $i).

$prevname := $filter($names, function($v, $j, $a) { $v.pos <= $i })[-1].name; filters the array $names for all entries with an index less than or equal to the current index ($v.pos <= $i) and returns only the last entry's .name.

(
  $names := ([$.payload#$i.({ "pos": $i, "name": $.Name })[name]])^(<pos);
  {
    "list": [
      $
        .payload#$i
        .(
          $values := $.Values ~> $split(" / ");
          $prevname := $filter($names, function($v, $j, $a) { $v.pos <= $i })[-1].name;
          {
            "NAME": ($.Name) ? ($.Name) : $prevname,
            "MSG_ID": $values[0],
            "MSG_TYPE": $values[1],
            "MSG_STATUS": $values[2]
          }
        )
    ]
  }
)

An example with this solution can be found here.

Answer 3

Rather than use the mapping operator (.), which has no defined order of execution, you need to write a recursive function to step through each row imperatively. You can then pass in the $name from the previous row for use as the default name for the next row. The following expression does this:

(
    $row := function($name, $payload) {(
        $first := $payload[0];
        $rest := $payload[[1..$count($payload)-1]];
        $values := $first.Values ~> $split(" / ");
        $name := ( ($first.Name) ? ($first.Name) : $name );
        [
            {
            "NAME": $name,
            "MSG_ID": $values[0],
            "MSG_TYPE": $values[1],
            "MSG_STATUS": $values[2]
            },
            $rest ? $row($name, $rest)
        ]
    )};
    {
        "list": $row("", payload)
    }
)

See https://try.jsonata.org/g7bcZKHKl

If your dataset is large enough, then this will eventually exceed the stack limit. If this happens, you need to rewrite the function slightly to be tail recursive. See https://docs.jsonata.org/programming#tail-call-optimization-tail-recursion
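As a rough sketch of what such a rewrite could look like (an illustration, not the answer author's code): the results are collected in an extra accumulator argument so that the recursive call is the last expression evaluated, which is the form the JSONata documentation describes as eligible for tail-call optimization. The names $fill, $index and $acc are hypothetical and chosen for this example only.

```jsonata
(
  $fill := function($rows, $index, $name, $acc) {
    /* Stop once every row has been processed and return the collected list. */
    $index >= $count($rows) ? $acc : (
      $row := $rows[$index];
      $values := $split($row.Values, " / ");
      /* Fall back to the name carried over from the previous rows. */
      $current := $row.Name ? $row.Name : $name;
      /* Tail call: nothing remains to be done after it returns. */
      $fill($rows, $index + 1, $current, $append($acc, [{
        "NAME": $current,
        "MSG_ID": $values[0],
        "MSG_TYPE": $values[1],
        "MSG_STATUS": $values[2]
      }]))
    )
  };

  { "list": $fill(payload, 0, "", []) }
)
```

This trades stack depth for repeated array indexing and copying, so it may still be slow on very large payloads; that has not been measured here.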

Answer 4

Andrew Coleman's answer pointed me in the right direction.

My final solution can deal with large data sets. It runs quickly even with thousands of objects in the payload array.

An example with this solution can be found here.

(
  $previous := function($index) {
    ($index > 0) ?
      ($.payload[$index - 1].Name ?
        $.payload[$index - 1].Name :
        $previous($index - 1)) : 
      "" };

  {
    "list": $
      .payload#$i
      .(
        $values := $.Values ~> $split(" / ");
        {
          "pos": (0) ? $i,
          "NAME": ($.Name) ? ($.Name) : $previous($i),
          "MSG_ID": $values[0],
          "MSG_TYPE": $values[1],
          "MSG_STATUS": $values[2]
        }
      )
  }
)

.payload#$i uses Positional Variable Binding (#$i) to give me the index of each row in the original payload array.

"NAME": ($.Name) ? ($.Name) : $previous($i) checks whether Name is filled in the current row; if not, a previous Name is looked up via the function $previous.

Function $previous checks whether the Name of the previous row is filled in and, if so, returns it. Otherwise it calls itself recursively until it finds a row above with Name filled in.

4 answers

Answer 1
2 2022-09-06 08:46:04

Answer 2
0 2022-09-05 16:38:03

Answer 3
0 2022-09-06 08:55:30

Answer 4
0 2022-09-06 22:56:56
