
jq filter: ok if item not in list

(jq newbie here, sorry if this question has an obvious answer:) )

I'd like to filter json based on whether a value is not in a list.

Here's a concrete example:

Input

[ 
    {
        "n": "A",
        "a": 659533330984,
        "vals": {
            "n2": "B",
            "b": 5193941030
        }
     },
    {
        "n": "A",
        "a": 659533330984,
        "vals": {
            "n2": "C",
            "b": 4872891707
        }
    },
    {
        "n": "B",
        "a": 659533330984,
        "vals": {
            "n2": "C",
            "b": 4872891707
        }
    }
]

Filter

[.n, .vals.n2] not in (["A", "B"], ["B", "C"])

Hence, in jq, I tried the following commands (also based on this related question):

jq '[ .[] | select([.n, .vals.n2] as $i | (["A", "B"], ["B", "C"]) | index($i) | not )]'

and

jq '[ .[] | select([.n, .vals.n2] != (["A", "B"], ["B", "C"]))]'

However, both commands give the following output:

[
  {
    "n": "A",
    "a": 659533330984,
    "vals": {
      "n2": "B",
      "b": 5193941030
    }
  },
  {
    "n": "A",
    "a": 659533330984,
    "vals": {
      "n2": "C",
      "b": 4872891707
    }
  },
  {
    "n": "A",
    "a": 659533330984,
    "vals": {
      "n2": "C",
      "b": 4872891707
    }
  },
  {
    "n": "B",
    "a": 659533330984,
    "vals": {
      "n2": "C",
      "b": 4872891707
    }
  }
]

whereas this would be the desired output -- without duplicates and with logical AND of all "blacklisted" values:

[
  {
    "n": "A",
    "a": 659533330984,
    "vals": {
      "n2": "C",
      "b": 4872891707
    }
  }
]

It makes sense that the second command does not work since, if I understood correctly, the comma operator means that jq evaluates the expression once for every listed element, hence the duplicates. However, simply piping through unique does not help, since the output should not contain any of the filter pairs.

The only other idea I have at the moment is to pipe select through select through select... for each item in the "blacklist". However, I'd like to read the blacklist as an input. I could create the command dynamically, but I was wondering whether there is a more elegant solution? It feels as if there must be...

I'd be very happy to hear your input on how to approach this best.

I'm using jq version jq-1.5-1-a5b5cbe.

The expression:

([.n, .vals.n2]) not in (["A", "B"], ["B", "C"])

would be equivalent to:

([.n, .vals.n2]) != ["A", "B"] and ([.n, .vals.n2]) != ["B", "C"]

As you have it here:

select([.n, .vals.n2] != (["A", "B"], ["B", "C"]))

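it's not quite the same: the comma makes the right-hand side produce two values, so the comparison runs once per value and select emits the item each time the result is true. In effect the conditions are combined with or rather than and.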
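To see the effect of the comma in isolation, here is a small sketch (the literal values are only illustrative). It collects the results of one comparison against a comma-separated stream, and then shows select emitting its input once per true result:

```shell
# The comma yields a stream, so != runs once per value on the right-hand side.
cmp=$(jq -c -n '[ ["A","B"] != (["A","B"], ["B","C"]) ]')
echo "$cmp"   # [false,true]

# select emits its input once for every true result, hence the duplicates.
dup=$(jq -c -n '[ "x" | select(. != ("a", "b")) ]')
echo "$dup"   # ["x","x"]
```

An item is therefore dropped only if it equals every listed value, which cannot happen when the listed values differ from each other.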

You'll need to do something more like this:

select([.n, .vals.n2] as $v | $v != ["A", "B"] and $v != ["B", "C"])

or

select([.n, .vals.n2] as $v | all(["A", "B"], ["B", "C"]; $v != .))

Also, if you want to stick with your first approach, put the blacklisted pairs in an array instead of separating them with commas, and wrap $i in an array so that index looks for it as a single element rather than as a subsequence:

select([.n, .vals.n2] as $i | [["A", "B"], ["B", "C"]] | index([$i]) | not)

When using index to find the position of an array (say $x) within an enclosing array, you have to write:

index([$x])

(This has to do with the fact that index is designed to work in a uniform way on both JSON strings and arrays.)

An efficient solution

[["A", "B"], ["B", "C"]] as $blacklist
| map( [.n, .vals.n2] as $i
       | select( $blacklist | index([$i]) | not) )
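Since the question asks for the blacklist to be read as an input, the same filter can take it from the command line. This is a minimal sketch, assuming a jq build that supports --argjson (1.5 and later, as far as I know); the shell variable names and the trimmed-down input are only for illustration:

```shell
# Trimmed-down version of the question's input (only the fields the filter reads).
input='[{"n":"A","vals":{"n2":"B"}},{"n":"A","vals":{"n2":"C"}},{"n":"B","vals":{"n2":"C"}}]'

# The blacklist is plain JSON, so it can come from a file or another command.
blacklist='[["A","B"],["B","C"]]'

# --argjson binds the parsed JSON to $blacklist inside the jq program.
result=$(printf '%s' "$input" |
  jq -c --argjson blacklist "$blacklist" \
     'map(select([.n, .vals.n2] as $i | $blacklist | index([$i]) | not))')

echo "$result"   # [{"n":"A","vals":{"n2":"C"}}]
```

This avoids generating the jq program dynamically: only the JSON passed to --argjson changes between runs.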

From the jq FAQ:

Q: Given an array, A, containing an item, X, how can I find the least index of X in A? Why does [ 1 ] | index( 1 ) return null rather than 0? Why does [1,2] | index([1,2]) return 0 rather than null?

A: The simplest uniform method for finding the least index of X in an array is to query for [X] rather than X itself, that is: index([X]).

By contrast, the filter index([1,2]) attempts to find [1,2] as a subsequence of contiguous items in the input array. This is for uniformity with the behavior of t | index(s) where s and t are strings.

If X is not an array, then index([X]) may be abbreviated to index(X).
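The difference between index([X]) and index(X) when X is itself an array can be checked directly. A short sketch using the pairs from the question (the variable names are arbitrary):

```shell
# Wrapped in an array: looks for the pair as one element of the outer array.
as_element=$(jq -c -n '[["A","B"],["B","C"]] | index([["B","C"]])')
echo "$as_element"      # 1

# Unwrapped: looks for "B","C" as consecutive elements, which the outer array lacks.
as_subsequence=$(jq -c -n '[["A","B"],["B","C"]] | index(["B","C"])')
echo "$as_subsequence"  # null

# The subsequence search does succeed on a flat array.
flat=$(jq -c -n '["A","B","C"] | index(["B","C"])')
echo "$flat"            # 1
```

Because null is falsy in jq, the unwrapped form followed by not is always true for these pairs, so such a filter would let every item through.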
