How to get the intersection of two JSON arrays using jq

Question

Given arrays X and Y (preferably both as inputs, but otherwise with one as input and the other hardcoded), how can I use jq to output the array containing all elements common to both? For example, what is a value of f such that

echo '[1,2,3,4]' | jq 'f([2,4,6,8,10])'

would output

[2,4]

?

I've tried the following:

map(select(in([2,4,6,8,10])))  --> outputs [1,2,3,4]
select(map(in([2,4,6,8,10])))  --> outputs [1,2,3,4,5]

Answer 1

A simple and quite fast (but somewhat naive) filter that probably does essentially what you want can be defined as follows:

   # x and y are arrays
   def intersection(x;y):
     ( (x|unique) + (y|unique) | sort) as $sorted
     | reduce range(1; $sorted|length) as $i
         ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;

If x is provided as input on STDIN, and y is provided in some other way (e.g. def y: ... ), then you could use this as: intersection(.;y)
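
As a rough sketch of that usage, the definition above can be pasted inline into a jq program and called with the piped array as x and a hardcoded array as y. The shell function name intersect_with_input is only an illustrative wrapper and is not part of the answer.

```shell
# Hypothetical wrapper: reads array x from STDIN; y is hardcoded in the call.
intersect_with_input() {
  jq -c '
    # x and y are arrays
    def intersection(x;y):
      ( (x|unique) + (y|unique) | sort) as $sorted
      | reduce range(1; $sorted|length) as $i
          ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
    intersection(.; [2,4,6,8,10])'
}

echo '[1,2,3,4]' | intersect_with_input   # prints [2,4]
```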

Other ways to provide two distinct arrays as input include:

using the --slurp option
using --arg a v (or --argjson a v if available in your jq); note that --arg binds v as a string, so an array passed that way must be parsed with fromjson
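
A minimal sketch of both options follows. It uses the subtraction expression x - (x - y) as a stand-in for whichever intersection filter you prefer; the variable names x and y and the shell function names are illustrative assumptions.

```shell
# Option 1: --slurp collects both arrays from STDIN into one array of arrays.
intersect_slurp() {
  jq -c --slurp '.[0] - (.[0] - .[1])'
}

# Option 2: --argjson binds each array to a jq variable; -n means no input is read.
intersect_argjson() {
  jq -c -n --argjson x "$1" --argjson y "$2" '$x - ($x - $y)'
}

echo '[1,2,3,4] [2,4,6,8,10]' | intersect_slurp   # prints [2,4]
intersect_argjson '[1,2,3,4]' '[2,4,6,8,10]'      # prints [2,4]
```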

Here's a simpler but slower def that's nevertheless quite fast in practice:

    def i(x;y):
       if (y|length) == 0 then []
       else (x|unique) as $x
       | $x - ($x - y)
       end ;

Here's a standalone filter for finding the intersection of arbitrarily many arrays:

# Input: an array of arrays
def intersection:
  def i(y): ((unique + (y|unique)) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
       ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
  reduce .[1:][] as $a (.[0]; i($a)) ;

Examples:

[ [1,2,4], [2,4,5], [4,5,6]] #=> [4]
[[]]                         #=> []
[]                           #=> null
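
To try these examples from a shell, one possible approach is to wrap the filter in a small function that reads an array of arrays from STDIN. The name intersect_all is a hypothetical wrapper chosen for this sketch.

```shell
# Hypothetical wrapper: STDIN must be an array of arrays.
intersect_all() {
  jq -c '
    def intersection:
      def i(y): ((unique + (y|unique)) | sort) as $sorted
      | reduce range(1; $sorted|length) as $i
           ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
      reduce .[1:][] as $a (.[0]; i($a)) ;
    intersection'
}

echo '[[1,2,4],[2,4,5],[4,5,6]]' | intersect_all   # prints [4]
echo '[[]]' | intersect_all                        # prints []
echo '[]'   | intersect_all                        # prints null
```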

Of course, if x and y are already known to be sorted and/or unique, more efficient solutions are possible. See in particular "Finite Sets of JSON Entities".

Answer 2

Simple Explanation

The complexity of all these answers obscures the underlying principle. That's unfortunate because the principle is simple:

array1 minus array2 returns:

everything that's left in array1

after removing everything that is in array2

(and discarding the rest of array2)

Simple Demo

# From array1, subtract array2, leaving the remainder
$ jq --null-input '[1,2,3,4] - [2,4,6,8]'
[
  1,
  3
]

# Subtract the remainder from the original
$ jq --null-input '[1,2,3,4] - [1,3]'
[
  2,
  4
]

# Put it all together
$ jq --null-input '[1,2,3,4] - ([1,2,3,4] - [2,4,6,8])'
[
  2,
  4
]

`comm` Demo

def comm:
  (.[0] - (.[0] - .[1])) as $d |
    [.[0]-$d, .[1]-$d, $d]
;

With that understanding, I was able to imitate the behavior of the *nix comm command:

With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.

$ echo 'def comm: (.[0]-(.[0]-.[1])) as $d | [.[0]-$d,.[1]-$d, $d];' > comm.jq
$ echo '{"a":101, "b":102, "c":103, "d":104}'                        > 1.json
$ echo '{         "b":202,          "d":204, "f":206, "h":208}'      > 2.json

$ jq --slurp '.' 1.json 2.json
[
  {
    "a": 101,
    "b": 102,
    "c": 103,
    "d": 104
  },
  {
    "b": 202,
    "d": 204,
    "f": 206,
    "h": 208
  }
]

$ jq --slurp '[.[] | keys | sort]' 1.json 2.json
[
  [
    "a",
    "b",
    "c",
    "d"
  ],
  [
    "b",
    "d",
    "f",
    "h"
  ]
]

$ jq --slurp 'include "comm"; [.[] | keys | sort] | comm' 1.json 2.json
[
  [
    "a",
    "c"
  ],
  [
    "f",
    "h"
  ],
  [
    "b",
    "d"
  ]
]

$ jq --slurp 'include "comm"; [.[] | keys | sort] | comm[2]' 1.json 2.json
[
  "b",
  "d"
]

Answer 3

Here is a solution which works by counting occurrences of elements in the arrays using foreach:

[
  foreach ($X[], $Y[]) as $r (
    {}
  ; .[$r|tostring] += 1
  ; if .[$r|tostring] == 2 then $r else empty end
  )
]

If this filter is in filter.jq, then

jq -M -n -c --argjson X '[1,2,3,4]' --argjson Y '[2,4,6,8,10]' -f filter.jq

will produce

[2,4]

It assumes there are no duplicates in the initial arrays. If that's not the case, it is easy to compensate with unique. For example:

[
  foreach (($X|unique)[], ($Y|unique)[]) as $r (
    {}
  ; .[$r|tostring] += 1
  ; if .[$r|tostring] == 2 then $r else empty end
  )
]
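
One caveat with the counting approach: the object keys are built with tostring, which leaves strings unchanged, so the number 1 and the string "1" end up sharing a counter. The sketch below is a variation rather than the answer's own code. It swaps in tojson for the key so that the two stay distinct; the function name count_intersect is hypothetical.

```shell
# Hypothetical wrapper around the counting filter, keyed by tojson
# instead of tostring (a swapped-in change, not the original answer).
count_intersect() {
  jq -c -n --argjson X "$1" --argjson Y "$2" '
    [ foreach (($X|unique)[], ($Y|unique)[]) as $r (
        {}
      ; .[$r|tojson] += 1
      ; if .[$r|tojson] == 2 then $r else empty end
      )
    ]'
}

count_intersect '[1,2,3,4]' '[2,4,6,8,10]'   # prints [2,4]
count_intersect '[1,2]' '["1",3]'            # prints []
```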

Answer 4

$ echo '[1,2,3,4] [2,4,6,8,10]' | jq --slurp '[.[0][] as $x | .[1][] | select($x == .)]'
[
  2,
  4
]
