How to filter this json by nested value using `jq` and print the parent key identifier?

Suppose I have this JSON:

{
  "sha256:0085b5379bf1baeb4a430128782440fe636938aa739f6a5ecc4152a22f19b08b": {
    "imageSizeBytes": "596515805",
    "layerId": "",
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "tag": [
      "python-3-toolchain-0.1.2"
    ],
    "timeCreatedMs": "1564631021992",
    "timeUploadedMs": "1564631067325"
  },
  "sha256:1ec7631f74a3d6d37bf9194c13854f33315260ae1f27347263dd0a8974ee82bb": {
    "imageSizeBytes": "513574770",
    "layerId": "",
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "tag": [
      "python-2-toolchain-latest"
    ],
    "timeCreatedMs": "1535447023647",
    "timeUploadedMs": "1535447042373"
  }
}

I want to select the image information (as well as the sha256 digest) with a certain tag. Example: I want to select only tag == "python-2-toolchain-latest", so it prints this JSON (reformatted):

 {
    "digest": "sha256:1ec7631f74a3d6d37bf9194c13854f33315260ae1f27347263dd0a8974ee82bb",
    "imageSizeBytes": "513574770",
    "layerId": "",
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "tag": [
      "python-2-toolchain-latest"
    ],
    "timeCreatedMs": "1535447023647",
    "timeUploadedMs": "1535447042373"
  }

I have tried various approaches and am stuck on how to reference the sha256 key.

A possible jq program to accomplish your goal:

# Embed the final result into an array to get a valid JSON output
[
    # Convert the input object into a list of { key, value } objects
    to_entries[]

    # Keep only the objects that contain the desired tag
    # The .tag field may contain multiple tags and the desired one can be at any position
    | select(.value.tag | contains(["python-2-toolchain-latest"]))

    # Add the key into the value object into the .digest property
    | .value.digest = .key

    # Keep only the values (the modified objects)
    | .value

# That's all, folks
]


Here's a straightforward, concise, and efficient solution:

keys_unsorted[] as $k
| .[$k] as $value
| select($value.tag[0] ==  "python-2-toolchain-latest")
| {digest: $k} + $value
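
Note that the filter above compares only the first tag (.tag[0]). As a rough cross-check outside jq, the same lookup can be modeled in Python. This is only a sketch: the function name find_by_first_tag and the shortened digests are illustrative assumptions, not part of the answer.

```python
import json

# Shortened stand-in for the question's data (digests abbreviated for illustration).
images = {
    "sha256:0085": {"imageSizeBytes": "596515805", "tag": ["python-3-toolchain-0.1.2"]},
    "sha256:1ec7": {"imageSizeBytes": "513574770", "tag": ["python-2-toolchain-latest"]},
}


def find_by_first_tag(data, wanted):
    """Model of the jq filter above: match on the first tag, then prepend the digest."""
    for key, value in data.items():          # dicts keep insertion order, like keys_unsorted
        tags = value.get("tag") or [None]    # jq's .tag[0] yields null for an empty or missing array
        if tags[0] == wanted:
            yield {"digest": key, **value}   # digest first, then the original fields


matches = list(find_by_first_tag(images, "python-2-toolchain-latest"))
print(json.dumps(matches, indent=2))
```

A record that carries the wanted tag in any position other than the first is skipped by this variant, which is fine for the sample data but worth keeping in mind for images with several tags.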

Here is what I worked up. I assumed that the tag array could contain more than a single entry...

.
|to_entries[]
|.key as $k
|.value as $v
|.value.tag[]
|select(.=="python-2-toolchain-latest")
|[ { "digest": ($k) }, $v ] | add

After seeing the peak answer, I like the last line like this better:

|[ { "digest": ($k) } + $v ]

If the tag can occur twice, then this would output the same record twice. There must be a better way to simply check if "python-2-toolchain-latest" is in the tag[] array. My jq-fu is not strong enough.
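
The duplication described above comes from emitting one result per matching tag. Here is a minimal Python sketch of the difference, assuming a made-up record that lists the same tag twice (the sample data and the function names per_tag and per_record are hypothetical). Testing membership once per record yields each record at most once.

```python
# Hypothetical record that repeats the wanted tag, to reproduce the duplicate output.
images = {
    "sha256:aaaa": {"tag": ["python-2-toolchain-latest", "python-2-toolchain-latest"]},
    "sha256:bbbb": {"tag": ["python-3-toolchain-0.1.2"]},
}
WANTED = "python-2-toolchain-latest"


def per_tag(data, wanted):
    """One output per matching tag, like iterating .value.tag[] and then selecting."""
    out = []
    for key, value in data.items():
        for tag in value["tag"]:
            if tag == wanted:
                out.append({"digest": key, **value})
    return out


def per_record(data, wanted):
    """One output per record: test membership once instead of iterating the tags."""
    return [{"digest": key, **value} for key, value in data.items() if wanted in value["tag"]]


print(len(per_tag(images, WANTED)))     # 2: the same record appears twice
print(len(per_record(images, WANTED)))  # 1
```

In jq itself, putting any(.value.tag[]; . == "python-2-toolchain-latest") inside select(…) should be one way to express the once-per-record test.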

For those who are open to alternatives, here's how the same JSON manipulation is achievable with jtc, a walk-path based unix utility:

bash $ <file.json jtc -w'[:]<D>k<tag>l<python-2-toolchain-latest>[-2]' -T'{"digest":{D}",{}}'
{
   "digest": "sha256:1ec7631f74a3d6d37bf9194c13854f33315260ae1f27347263dd0a8974ee82bb",
   "imageSizeBytes": "513574770",
   "layerId": "",
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "tag": [
      "python-2-toolchain-latest"
   ],
   "timeCreatedMs": "1535447023647",
   "timeUploadedMs": "1535447042373"
}

Note: the lexeme <tag>l is not required if the searchable context in the next lexeme uniquely identifies the tag; in that case the lexeme can be omitted.

PS. Disclosure: I'm the creator of jtc, a shell CLI tool for JSON operations.

I would use:

.
| to_entries
| .[]
| select(.value.tag | contains(["python-2-toolchain-latest"]))
| { digest: .key } + .value

  • I put your example data in a file I named pratama-1.json.
  • I ran the following command and got this output:
     $ jq '. | to_entries | .[] | select(.value.tag | contains(["python-2-toolchain-latest"])) | { digest: .key } + .value' pratama-1.json
     {
       "digest": "sha256:1ec7631f74a3d6d37bf9194c13854f33315260ae1f27347263dd0a8974ee82bb",
       "imageSizeBytes": "513574770",
       "layerId": "",
       "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
       "tag": [
         "python-2-toolchain-latest"
       ],
       "timeCreatedMs": "1535447023647",
       "timeUploadedMs": "1535447042373"
     }

Breaking this down

You can think of a jq program as analogous to a sh pipeline, except:
  • with sh (or bash), each stage of the pipeline (that is, each command between |s) has a stream of bytes (often just text) as its input and output;
  • with jq, each stage has a stream of JSON values as its inputs and outputs.
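
As a loose analogy only, the stream-of-values idea can be mimicked with Python generators, where each stage consumes values and yields zero or more values. The stage names below (identity, iterate, select) are invented for this sketch to mirror jq's ., .[] and select(…); they are not jq APIs.

```python
def identity(values):
    """Like jq's `.`: pass every input through unchanged."""
    for v in values:
        yield v


def iterate(values):
    """Like jq's `.[]`: turn each array (or object) input into its elements."""
    for v in values:
        yield from (v.values() if isinstance(v, dict) else v)


def select(values, predicate):
    """Like jq's select(f): keep only the inputs for which the predicate holds."""
    for v in values:
        if predicate(v):
            yield v


# One input (an array) becomes three outputs, then only the string survives.
stream = [[True, 3.1416, "foo"]]
result = list(select(iterate(identity(stream)), lambda v: isinstance(v, str)))
print(result)  # ['foo']
```

Each stage sees one value at a time and may emit none, one, or many, which is the main difference from a byte-oriented shell pipeline.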

  1. The simplest jq program is just . (a single dot), which passes each input through unchanged:
    • If the input is the JSON value true, the output is true.
    • If the input is the JSON value [ true, 3.1416, "foo" ] (an array value), the output is the same.
    • If the inputs are the three JSON values true, 3.1416, and "foo" (a boolean, a number, and a string), then the three outputs will be that boolean, that number, and that string.
  2. At the beginning of my script, the . just represents the input value, which in this case is the JSON object you included in your question.
  3. The next stage is to_entries:
    • It converts a JSON object into a JSON array of JSON objects, one per key. It turns an input like:
       { "a": 3.1416, "b": false }
      into an array like this:
       [ { "key": "a", "value": 3.1416 }, { "key": "b", "value": false } ]
  4. The next stage is .[], which is a jq operator that turns one JSON value into many:
    • If the input is a single JSON array like [ true, 3.1416, "foo" ], the three outputs are the JSON values true, 3.1416, and "foo".
    • In our case, it unwraps the JSON array around all those key-value objects so we get many output values instead of one JSON array output value.
  5. The next stage is the select(…):
    • For each input value, it evaluates the expression in its parentheses and "passes along" that input as an output if the expression is true.
    • For example, for the three inputs true, 3.1416, and "foo", select(type == "string") would have only one output: the string "foo".
    • My select(…) has two inputs:
      1. A JSON object { "key": "sha256:008…", "value": { "imageSizeBytes": … } } AND
      2. A JSON object { "key": "sha256:1ec…", "value": { "imageSizeBytes": … } }.
    • Inside my select(…), I use a subexpression that is itself a jq pipeline: .value.tag | contains(["python-2-toolchain-latest"])
      1. The first part, .value.tag, yields the value of the field with key "tag" in each of those objects. For your example data, each value is a JSON array.
      2. The contains([…]) part evaluates to true for an input JSON array if every value in its argument JSON array […] is contained in some element of that input array. For strings, "contained" means substring, so a shorter argument such as "python-2" would also match.
        • Some examples:
        $ jq '. | contains([ "foo" ])' <<< '[ true, 3.1416, "foo" ]'
        true
        $ jq '. | contains([ "foo", true ])' <<< '[ true, 3.1416, "foo" ]'
        true
        $ jq '. | contains([ "foo", true, null ])' <<< '[ true, 3.1416, "foo" ]'
        false
        $ jq '. | contains([ "foo", true, false ])' <<< '[ true, 3.1416, "foo" ]'
        false
    • So, the expression in my select(…) evaluates to true for each input JSON object that has a key named "value" whose JSON sub-object value has a key named "tag" whose JSON array value contains an element matching the JSON string value "python-2-toolchain-latest".
    • For each input where the expression inside select(…) is true, that input value becomes one of the output values.
    • For your example data, this is only the second sub-object: { "key": "sha256:1ec…", "value": { … } }.
  6. The last stage of my pipeline is a jq expression that looks like a JSON object.
    • { digest: .key } says: for each input value, output a JSON object value with:
      • a key named "digest" AND
      • a value equal to the value associated with the input JSON object's key named "key".
      • Since our input is the JSON object { "key": "sha256:1ec…", … }, this would give us the output JSON object { "digest": "sha256:1ec…" }.
    • But we want other stuff in our output JSON object too: we want to add in the JSON object value associated with the key "value" in our input object. We get that by adding + .value.
    • The jq + operator, when used with JSON object values, "merges" its operand JSON objects into a single output JSON object, e.g. { "a": true } + { "b": 3.1416 } yields { "a": true, "b": 3.1416 }.
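
The steps above can be sketched end to end in Python as a cross-check. This is an approximate model, not jq's implementation: the helper names (same_scalar, jq_contains, to_entries, pick) and the shortened digests are assumptions made for illustration. It also models one detail of contains: for strings, jq tests for substrings rather than equality.

```python
def same_scalar(a, b):
    """Equality for non-container values, keeping booleans distinct from numbers."""
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


def jq_contains(a, b):
    """Approximate model of jq's `a | contains(b)`.

    Real jq raises an error when the two top-level types differ; this sketch returns False.
    """
    if isinstance(a, str) and isinstance(b, str):
        return b in a  # strings: substring test, not equality
    if isinstance(a, list) and isinstance(b, list):
        # every element of b must be contained in some element of a
        return all(any(jq_contains(x, y) for x in a) for y in b)
    if isinstance(a, dict) and isinstance(b, dict):
        return all(k in a and jq_contains(a[k], v) for k, v in b.items())
    return same_scalar(a, b)


def to_entries(obj):
    """Model of to_entries: one {key, value} object per key."""
    return [{"key": k, "value": v} for k, v in obj.items()]


def pick(images, wanted):
    """Model of: to_entries | .[] | select(.value.tag | contains([wanted])) | {digest: .key} + .value"""
    return [
        {"digest": e["key"], **e["value"]}  # like jq's +, the right-hand object wins on key clashes
        for e in to_entries(images)
        if jq_contains(e["value"]["tag"], [wanted])
    ]


# Shortened stand-in for the question's data (digests abbreviated for illustration).
images = {
    "sha256:0085": {"imageSizeBytes": "596515805", "tag": ["python-3-toolchain-0.1.2"]},
    "sha256:1ec7": {"imageSizeBytes": "513574770", "tag": ["python-2-toolchain-latest"]},
}

print(pick(images, "python-2-toolchain-latest"))  # only the sha256:1ec7 record
print(len(pick(images, "python")))                # 2: substring matching hits both records
```

If exact tag equality matters, a per-element comparison is the safer test, since the substring rule makes a short search string match more records than intended.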
