Find common keys in JSON objects using jq

I'm trying to find all the common keys in a JSON file, given that we don't know the names of the keys in the file.

The JSON file looks like:

{
  "DynamicKey1": {
    "foo": 1,
    "bar": 2
  },
  "DynamicKey2": {
    "bar": 3
  },
  "DynamicKey3": {
    "foo": 5,
    "zyx": 5
  }
}

Expected result:

{
 "foo"
}

I was trying to apply reduce/foreach logic here, but I am not sure how to write it in jq. I appreciate any help!

jq '. as $ss | reduce range(1; $ss|length) as $i ([]; . + reduce ($ss[i] | keys) as $key ([]; if $ss[$i - 1] | has($key) then . +$key else . end))' file.json

There are some inconsistencies in the question as posted: there are no keys common to all the objects, and if one looks at the pair-wise intersection of keys, the result would include both "foo" and "bar".

In the following, I'll present solutions for both these problems.

Keys in more than one object

[.[] | keys_unsorted[]] | group_by(.)[] | select(length>1)[0]
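As a quick check, the filter above can be run against the sample input from the question. This is only a sketch: the variable names `input` and `result` are illustrative, and it assumes `jq` (1.5 or later) is on the PATH.

```shell
# Sample input from the question, written on one line
input='{"DynamicKey1":{"foo":1,"bar":2},"DynamicKey2":{"bar":3},"DynamicKey3":{"foo":5,"zyx":5}}'

# Collect every key of every inner object, group equal keys together,
# and keep one representative of each group that has more than one member.
# -c prints each result compactly, one per line.
result=$(printf '%s' "$input" | jq -c '[.[] | keys_unsorted[]] | group_by(.)[] | select(length>1)[0]')

printf '%s\n' "$result"
```

For this input the command prints "bar" and "foo" on separate lines, since each of them appears in two of the three objects.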

Keys in all the objects

Here's a solution using a similar approach:

length as $length
| [.[] | keys_unsorted[]] | group_by(.)[]
| select(length==$length) 
| .[0]

This involves group_by/1, which is implemented using a sort.
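To see the "keys in all the objects" filter in action, here is a small sketch that runs it on two inputs: the question's sample, which has no key common to all three objects, and a hypothetical variant (not from the question) in which "bar" appears everywhere. The variable names are illustrative, and `jq` is assumed to be installed.

```shell
# The filter from above, kept in single quotes so the shell leaves $length alone
filter='length as $length
| [.[] | keys_unsorted[]] | group_by(.)[]
| select(length==$length)
| .[0]'

# Input from the question: no key occurs in all three objects
question_input='{"DynamicKey1":{"foo":1,"bar":2},"DynamicKey2":{"bar":3},"DynamicKey3":{"foo":5,"zyx":5}}'

# Hypothetical variant: "bar" occurs in every object
modified_input='{"DynamicKey1":{"foo":1,"bar":2},"DynamicKey2":{"bar":3},"DynamicKey3":{"bar":5,"zyx":5}}'

result_question=$(printf '%s' "$question_input" | jq -c "$filter")
result_modified=$(printf '%s' "$modified_input" | jq -c "$filter")

printf 'question input: %s\n' "${result_question:-<no output>}"
printf 'modified input: %s\n' "${result_modified:-<no output>}"
```

The first run produces no output at all, which matches the observation that the question's sample has no key shared by every object; the second run prints "bar".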

Here is an alternative approach that relies on the built-in function keys to do the sorting (the point being that nk ln(nk) - n(k ln(k)) = nk ln(n), i.e. having n small sorts of k items is better than one large sort of n*k items):

# The intersection of an arbitrary number of sorted arrays
def intersection_of_sorted_arrays:
  # intersecting/2 emits, as a stream, the items common to the sorted arrays $A and $B
  def intersecting($A; $B):
    # pop takes [i, j], a pair of indices into $A and $B, and advances whichever lags behind
    def pop:
      .[0] as $i
      | .[1] as $j
      | if $i == ($A|length) or $j == ($B|length) then empty
        elif $A[$i] == $B[$j] then $A[$i], ([$i+1, $j+1] | pop)
        elif $A[$i] <  $B[$j] then [$i+1, $j] | pop
        else [$i, $j+1] | pop
        end;
    [0,0] | pop;
  reduce .[1:][] as $x (.[0]; [intersecting(.; $x)]);
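The sorting-cost comparison that motivates this definition can be spelled out step by step. The sketch below assumes n objects with k keys each and a comparison sort whose cost on m items is proportional to m ln(m), with constant factors dropped; these are the assumptions implicit in the remark, not measured figures.

```latex
\begin{align*}
\underbrace{nk\ln(nk)}_{\text{one sort of } nk \text{ items}}
  - \underbrace{n\,(k\ln k)}_{n \text{ sorts of } k \text{ items}}
  &= nk\,(\ln n + \ln k) - nk\ln k \\
  &= nk\ln n
\end{align*}
```

The difference is positive whenever n > 1, so under these assumptions the n small sorts are cheaper by roughly nk ln(n) comparisons.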

To compute the keys common to all the objects:

[.[] | keys] | intersection_of_sorted_arrays
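Putting the definition and the call together, a complete invocation might look like the sketch below. The input is hypothetical (it is not the question's sample) and was chosen so that two keys, "bar" and "baz", occur in every object; `jq` 1.5 or later is assumed, since the `$A`-style parameters need it.

```shell
# Hypothetical input: "bar" and "baz" occur in all three objects
input='{"a":{"foo":1,"bar":2,"baz":0},"b":{"bar":3,"baz":1},"c":{"baz":5,"bar":5,"zyx":5}}'

result=$(printf '%s' "$input" | jq -c '
# Intersection of an arbitrary number of sorted arrays (definition as above)
def intersection_of_sorted_arrays:
  def intersecting($A; $B):
    def pop:
      .[0] as $i
      | .[1] as $j
      | if $i == ($A|length) or $j == ($B|length) then empty
        elif $A[$i] == $B[$j] then $A[$i], ([$i+1, $j+1] | pop)
        elif $A[$i] <  $B[$j] then [$i+1, $j] | pop
        else [$i, $j+1] | pop
        end;
    [0,0] | pop;
  reduce .[1:][] as $x (.[0]; [intersecting(.; $x)]);

# keys (unlike keys_unsorted) returns each key list already sorted
[.[] | keys] | intersection_of_sorted_arrays
')

printf '%s\n' "$result"
```

For this input the result is the array ["bar","baz"]. Run on the sample from the question, the same program yields an empty array, because no key is shared by all three objects there.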

Here is a sort-free and time-efficient answer that relies on the efficiency of jq's implementation of lookups in a JSON dictionary. Since keys are strings, we can simply use the concept of a "bag of words" (bow):

def bow(stream): 
  reduce stream as $word ({}; .[$word|tostring] += 1);

We can now solve the "Keys common to all objects" problem as follows:

length as $length
| bow(.[] | keys_unsorted[])
| to_entries[]
| select(.value==$length).key

And similarly for the "Keys in more than one object" problem.
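The "similarly" can be made concrete: one plausible adaptation is to keep the entries whose count exceeds 1 instead of those whose count equals the number of objects. The sketch below applies that variant to the question's sample; the `.value > 1` condition and the variable names are illustrative rather than taken from the answer, and `jq` is assumed to be installed.

```shell
# Sample input from the question
input='{"DynamicKey1":{"foo":1,"bar":2},"DynamicKey2":{"bar":3},"DynamicKey3":{"foo":5,"zyx":5}}'

result=$(printf '%s' "$input" | jq -c '
# Bag of words: map each word to the number of times it occurs in the stream
def bow(stream):
  reduce stream as $word ({}; .[$word|tostring] += 1);

# Keep the keys that were counted more than once
bow(.[] | keys_unsorted[])
| to_entries[]
| select(.value > 1).key
')

printf '%s\n' "$result"
```

This prints "foo" and "bar", one per line. Because no sort is involved, the order of the two lines follows the key order of the counting object and should not be relied upon.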

Of course, to achieve the time-efficiency, there is the usual space-time tradeoff.

Alternatively, the same JSON operation could be achieved in two steps using the walk-path Unix utility jtc:

bash $ <file.json jtc -w'<.*>L:<>k' -j | jtc -w'<.>Q:' -j
[
   "bar",
   "foo"
]
bash $ 
  • in the first step it lists all the labels (as a JSON array)
  • in the second step it finds all the duplicate records

P.S. Disclosure: I'm the creator of jtc, a shell CLI tool for JSON operations.
