jq: Remove items from array that match a condition based on other items in the same array

I need help with a somewhat complex jq query.

Given

[{
  "type": "ITEM_PURCHASED",
  "timestamp": 1710829,
  "participantId": 2,
  "itemId": 3089
},
{
  "type": "ITEM_PURCHASED",
  "timestamp": 1711620,
  "participantId": 7,
  "itemId": 2055
},
{
  "type": "ITEM_PURCHASED",
  "timestamp": 1711621,
  "participantId": 7,
  "itemId": 1058
},
{
  "type": "ITEM_PURCHASED",
  "timestamp": 1714435,
  "participantId": 9,
  "itemId": 1037
},
{
  "type": "ITEM_UNDO",
  "timestamp": 1716107,
  "participantId": 7,
  "afterId": 0,
  "beforeId": 2055
},
{
  "type": "ITEM_UNDO",
  "timestamp": 1716272,
  "participantId": 7,
  "afterId": 0,
  "beforeId": 1058
},
{
  "type": "ITEM_PURCHASED",
  "timestamp": 1718091,
  "participantId": 7,
  "itemId": 1026
}]

Desired output:

[{
  "type": "ITEM_PURCHASED",
  "timestamp": 1710829,
  "participantId": 2,
  "itemId": 3089
},
{
  "type": "ITEM_PURCHASED",
  "timestamp": 1714435,
  "participantId": 9,
  "itemId": 1037
},
{
  "type": "ITEM_PURCHASED",
  "timestamp": 1718091,
  "participantId": 7,
  "itemId": 1026
}]

I would like to filter this array and remove all the purchased items that were "undone". An ITEM_PURCHASED object is undone by an ITEM_UNDO object that comes after it with a higher timestamp, a matching participantId, and beforeId == itemId.

I tried the following approach:

  1. Collect all the ITEM_UNDO objects
  2. Find all corresponding ITEM_PURCHASED objects
  3. Subtract both from the original list

Step 2 is giving me trouble. I have the following code so far which does not work:

jq '
map(select(.type=="ITEM_UNDO")) as $undos |
 [
    {
      undo: $undos[],
      before_purchases: map( select(.type=="ITEM_PURCHASED"
                                    and .itemId == $undos[].beforeId
                                    and .participantId == $undos[].participantId
                                    )

                           )
    }
  ] as $undo_with_purchased | $undo_with_purchased
'

I know why it doesn't work: in the lines

and .itemId == $undos[].beforeId
and .participantId == $undos[].participantId

$undos is expanded twice independently, rather than using the same instance for both comparisons, and then a third time in

undo: $undos[],

I can't seem to find a good way to force jq to iterate over $undos only once and use the same instance for all comparisons. In general I'm having trouble iterating over multiple arrays at the same time and performing operations on them. This would be a no-brainer in any procedural language, but what's the best way to do this kind of thing in jq?

Thanks for any suggestions!

First, let's define a filter that tells whether an item in the array has been "undone" by a subsequent (in the array and in time) item. This is straightforward to do using any/2:

# input: the entire array
# output: true iff item n is "undone" by a subsequent item
def undone($n):
  . as $in
  | length as $length
  | .[$n] as $nth
  | if $nth.type != "ITEM_PURCHASED" then false
    else any( range($n+1; $length) | $in[.];
              .type == "ITEM_UNDO"
              and .participantId == $nth.participantId
              and .beforeId == $nth.itemId
              and .timestamp > $nth.timestamp)
    end;

Now the query is quite straightforward:

[ range(0;length) as $i
  | select( (.[$i].type == "ITEM_PURCHASED") and (undone($i) | not) )
  | .[$i] ]

Invocation: jq -f program.jq data.json

Output: an array with the three ITEM_PURCHASED items that were not undone, matching the desired output.
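
As for the three-step approach attempted in the question, the independent expansions of $undos can be avoided by binding each undo to a variable once with `$undos[] as $u` and reusing `$u` in every comparison. The following is only a sketch of that idea, not a replacement for the filter above. It assumes jq 1.5 or later is on the PATH; the names `$u` and `$pairs` and the shell variables `data` and `result` are illustrative, and the timestamp comparison is added to follow the rule stated in the question. Only the itemId of each surviving object is printed to keep the output short.

```shell
# Sample data from the question, one object per line.
data='[
 {"type":"ITEM_PURCHASED","timestamp":1710829,"participantId":2,"itemId":3089},
 {"type":"ITEM_PURCHASED","timestamp":1711620,"participantId":7,"itemId":2055},
 {"type":"ITEM_PURCHASED","timestamp":1711621,"participantId":7,"itemId":1058},
 {"type":"ITEM_PURCHASED","timestamp":1714435,"participantId":9,"itemId":1037},
 {"type":"ITEM_UNDO","timestamp":1716107,"participantId":7,"afterId":0,"beforeId":2055},
 {"type":"ITEM_UNDO","timestamp":1716272,"participantId":7,"afterId":0,"beforeId":1058},
 {"type":"ITEM_PURCHASED","timestamp":1718091,"participantId":7,"itemId":1026}
]'

result=$(printf '%s' "$data" | jq -c '
  # Step 1: collect all the ITEM_UNDO objects.
  map(select(.type == "ITEM_UNDO")) as $undos
  # Step 2: bind each undo to $u exactly once, then find its purchases.
  # Binding with "as" leaves "." untouched, so map still sees the whole array.
  | [ $undos[] as $u
      | { undo: $u,
          before_purchases: map(select(.type == "ITEM_PURCHASED"
                                       and .itemId == $u.beforeId
                                       and .participantId == $u.participantId
                                       and .timestamp < $u.timestamp)) }
    ] as $pairs
  # Step 3: subtract the undos and the undone purchases from the original list.
  | . - $undos - [ $pairs[].before_purchases[] ]
  | map(.itemId)
')
echo "$result"
```

Dropping the final `map(.itemId)` yields the full objects instead. Note that array subtraction removes every element equal to one on the right-hand side, so this sketch assumes the input contains no exact duplicate objects that should be kept.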

Style

One can write:

range($n+1; $length) | $in[.]

more compactly, and perhaps more idiomatically, as:

$in[range($n+1; $length)]

In fact, both $in and $length can be dispensed with altogether, so the snippet in question would become simply:

.[range($n+1; length)]
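
With that last simplification applied, the whole program might look like the sketch below. It assumes jq 1.5 or later is on the PATH and passes the program inline rather than through program.jq; the shell variables `data` and `result` are illustrative, and only the itemId of each remaining object is printed to keep the output short.

```shell
# Sample data from the question, one object per line.
data='[
 {"type":"ITEM_PURCHASED","timestamp":1710829,"participantId":2,"itemId":3089},
 {"type":"ITEM_PURCHASED","timestamp":1711620,"participantId":7,"itemId":2055},
 {"type":"ITEM_PURCHASED","timestamp":1711621,"participantId":7,"itemId":1058},
 {"type":"ITEM_PURCHASED","timestamp":1714435,"participantId":9,"itemId":1037},
 {"type":"ITEM_UNDO","timestamp":1716107,"participantId":7,"afterId":0,"beforeId":2055},
 {"type":"ITEM_UNDO","timestamp":1716272,"participantId":7,"afterId":0,"beforeId":1058},
 {"type":"ITEM_PURCHASED","timestamp":1718091,"participantId":7,"itemId":1026}
]'

result=$(printf '%s' "$data" | jq -c '
  # input: the entire array
  # output: true iff item n is "undone" by a subsequent item
  def undone($n):
    .[$n] as $nth
    | if $nth.type != "ITEM_PURCHASED" then false
      else any( .[range($n+1; length)];   # only items after position n
                .type == "ITEM_UNDO"
                and .participantId == $nth.participantId
                and .beforeId == $nth.itemId
                and .timestamp > $nth.timestamp )
      end;

  # Keep the purchases that no later item undoes.
  [ range(0; length) as $i
    | select( (.[$i].type == "ITEM_PURCHASED") and (undone($i) | not) )
    | .[$i].itemId ]
')
echo "$result"
```

Inside any/2 the generator `.[range($n+1; length)]` is evaluated against the whole array, so `length` there is the array length and the range is empty for the last item.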
