
Bash script to extract all specific key values from an unstructured JSON file

I am trying to extract all the values of a specific key from the JSON file below.

{
  "tags": [
    {
      "name": "xxx1",
      "image_id": "yyy1"
    },
    {
      "name": "xxx2",
      "image_id": "yyy2"
    }
  ]
}

I used the code below to get the values of the image_id key.

echo new.json | jq '.tags[] | .["image_id"]'

I'm getting the error message below.

parse error: Invalid literal at line 2, column 0

I think either the JSON file is not in the proper format or the echo command used to pass the JSON file to jq is wrong.

Given the above input, my desired output is:

yyy1
yyy2

What needs to be fixed to make this happen?

When you run:

echo new.json | jq '.tags[] | .["image_id"]'

...the string new.json (not the contents of the file named new.json) is fed to jq's stdin, and that string is what jq tries to parse as JSON text.

Instead, run:

jq -r '.tags[] | .["image_id"]' <new.json

...to open new.json directly on jq's stdin (and, with -r, to avoid adding unwanted quotes to the output).
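To make the difference concrete, here is a minimal sketch that recreates the sample file and compares what each form hands to jq. The scratch directory and the variable names (tmpdir, json_file, piped, ids_stdin, ids_arg) are assumptions made for illustration only; it also assumes jq is installed.

```shell
# Recreate the sample file from the question in a scratch directory.
tmpdir=$(mktemp -d)
json_file="$tmpdir/new.json"
cat > "$json_file" <<'EOF'
{
  "tags": [
    { "name": "xxx1", "image_id": "yyy1" },
    { "name": "xxx2", "image_id": "yyy2" }
  ]
}
EOF

# echo writes its argument, i.e. the file *name*, to the pipe.
# This is the text jq would have been asked to parse.
piped=$(echo "$json_file")

# Redirection connects the file's *contents* to jq's stdin.
ids_stdin=$(jq -r '.tags[].image_id' < "$json_file")

# Passing the file name as an argument to jq reads the same contents.
ids_arg=$(jq -r '.tags[].image_id' "$json_file")

printf 'echo sends: %s\n' "$piped"
printf '%s\n' "$ids_stdin"
```

Both the redirection and the file-name argument should print yyy1 and yyy2 on separate lines; which one to use is mostly a matter of taste.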

Also, you may want to try an alternative approach: jtc, a walk-path Unix tool for JSON. With it, your query would look like this:

bash $ <new.json jtc -w'[tags][:][image_id]'
"yyy1"
"yyy2"
bash $ 

However, your new.json is not unstructured; on the contrary, it is well structured. If your new.json were indeed irregular (unstructured), the following query would work better:

bash $ <new.json jtc -w'<image_id>l:'
"yyy1"
"yyy2"
bash $ 
  1. Your filter .tags[] | .["image_id"] is valid, but can be abbreviated to:

.tags[] | .image_id

or even:

.tags[].image_id
  2. If you want the values associated with the "image_id" key, wherever that key occurs, you could use:

    .. | objects | select(has("image_id")) | .image_id

Or, if you don't mind throwing away false and null values:

.. | .image_id? // empty
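As a rough illustration of how the two recursive filters differ, the sketch below runs them against an irregular document. The file irregular.json and its contents (one image_id nested one level deeper, one set to null) are invented for this example and are not part of the question; it also assumes jq is installed.

```shell
# Hypothetical irregular input: image_id appears at different depths,
# and one occurrence is null.
tmpdir=$(mktemp -d)
json_file="$tmpdir/irregular.json"
cat > "$json_file" <<'EOF'
{
  "tags": [
    { "name": "xxx1", "image_id": "yyy1" },
    { "name": "xxx2", "nested": { "image_id": "yyy2" } },
    { "name": "xxx3", "image_id": null }
  ]
}
EOF

# Every object that has the key contributes its value, including null.
all_values=$(jq -r '.. | objects | select(has("image_id")) | .image_id' "$json_file")

# The shorter form drops null and false values, because // treats them
# as "no result" and falls through to empty.
truthy_values=$(jq -r '.. | .image_id? // empty' "$json_file")

printf 'with has():\n%s\n' "$all_values"
printf 'with // empty:\n%s\n' "$truthy_values"
```

With this input, the first filter should yield yyy1, yyy2 and null, while the second yields only yyy1 and yyy2.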
