
Extract strings from a large JSON output in bash script into an array

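The JSON file sits in the same directory as my script and contains a title field with a value near the beginning of each item; see "Item1", "Item2" and "Item3" in the sample JSON. At the end of each item's section there is a "services" section, with a title for each service the item is tied to. An item can have any number of services, or none at all.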

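What I want to do is search for a single service title, e.g. "STitle5", and if it exists, simply pop the item's main title (e.g. Item3) into an array. Based on the sample JSON below and the example just given, only Item1 and Item3 would be added to the array.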

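I've tried grepping with regex in a number of different ways, but I can't figure out how to go back and grab a specific string once something has been found after it. There can be thousands of entries in the JSON. I don't really need anything else from it, so I figured parsing the JSON directly as text would be the easiest approach.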

    [{"id_name": "Item1", "informational": {"values": ["werwe", "werwwe", "8", "ewrwrw", "werewrew", "64432.5390625", "64432.55859375", "64432.36328125", "werw werwerw", "2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "fields": ["werwerw", "erwrwr", "wewrewrer", "werrwrwer", "werwerwrw", "werwerewr", "werwrwr", "stuff", "vendor_product", "version"]}, "role": ["Application Server"], "cpu_cores": ["8"], "create_time": "2017-04-03 16:32:27.738432", "mod_timestamp": "2019-06-26T01:17:23.933103+00:00", "title": "Item1", "family": ["dfsfd"], "OS": ["dfdsfsf"], "sdfdsfdsds": "fdsfdsf", "dsfdsfsd": ["64432.5390625", "64432.55859375", "64432.36328125"], "host": ["dfdsfsdfsdfds"], "sdfdsfds": "sdfdsf", "vend": ["sdada"], "permissions": {"delete": true, "write": true, "user": "dsdsds", "group": {"delete": true, "write": true, "read": true}, "read": true}, "sdsdsdsdsds": ["dfsdfdsfdsfsd"], "_version": "3", "sgrp": "default", "object_type": "dfsfs", "mod_by": "user", "mod_time": "2019-06-25 13:09:47.543535", "_user": "user", "environment": ["dfsdfdfsd"], "description": "", "identifier": {"values": ["dfsdfdfdsffdfsdfs"], "fields": ["host"]}, "sdfdsfsfds": ["SMP"], "role": ["operating_system_host"], "mod_source": "REST", "_key": "afderea-be2d-47a6-9f0d-00857ereef6c", "version": ["2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "create_source": "unknown", "services": [{"title": "STitle", "_key": "865defee-d47f-4b8f-9435-bc4ere89e9b1f8d"}, {"title": "STitle2", "_key": "d9d5e231-3841-4376-a295-ea5fere95168482"}, {"title": "STitle3", "_key": "38165ff4-9da6-df-9a8b-a162aa7a68e8"}, {"title": "S

", "_key": "e2adb75e-9254-4774-b735-"}, {"title": "STtitle6", "_key": "381f54d0-d759-43a3-94b3"}, {"title": "STtitle7" , "_key": "8253-f2b6a1d6f836"}, {"title": "STtitle8", "_key": "bc69692b-48d8-4bd7-b62b"}]},

    {"id_name": "Item2", "informational": {"values": ["werwe", "werwwe", "8", "ewrwrw", "werewrew", "64432.5390625", "64432.55859375", "64432.36328125", "werw werwerw", "2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "fields": ["werwerw", "erwrwr", "wewrewrer", "werrwrwer", "werwerwrw", "werwerewr", "werwrwr", "stuff", "vendor_product", "version"]}, "role": ["Application Server"], "cpu_cores": ["8"], "create_time": "2017-04-03 16:32:27.738432", "mod_timestamp": "2019-06-26T01:17:23.933103+00:00", "title": "Item2", "family": ["dfsfd"], "OS": ["dfdsfsf"], "sdfdsfdsds": "fdsfdsf", "dsfdsfsd": ["64432.5390625", "64432.55859375", "64432.36328125"], "host": ["dfdsfsdfsdfds"], "sdfdsfds": "sdfdsf", "vend": ["sdada"], "permissions": {"delete": true, "write": true, "user": "dsdsds", "group": {"delete": true, "write": true, "read": true}, "read": true}, "sdsdsdsdsds": ["dfsdfdsfdsfsd"], "_version": "3", "sgrp": "default", "object_type": "dfsfs", "mod_by": "user", "mod_time": "2019-06-25 13:09:47.543535", "_user": "user", "environment": ["dfsdfdfsd"], "description": "", "identifier": {"values": ["dfsdfdfdsffdfsdfs"], "fields": ["host"]}, "sdfdsfsfds": ["SMP"], "role": ["operating_system_host"], "mod_source": "REST", "_key": "afderea-be2d-47a6-9f0d-00857ereef6c", "version": ["2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "create_source": "unknown", "services": [{"title": "STitle", "_key": "865defee-d47f-4b8f-9435-bc4ere89e9b1f8d"}, {"title": "STitle2", "_key": "d9d5e231-3841-4376-a295-ea5fere95168482"}]}, 

    {"id_name": "Item3", "informational": {"values": ["werwe", "werwwe", "8", "ewrwrw", "werewrew", "64432.5390625", "64432.55859375", "64432.36328125", "werw werwerw", "2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "fields": ["werwerw", "erwrwr", "wewrewrer", "werrwrwer", "werwerwrw", "werwerewr", "werwrwr", "stuff", "vendor_product", "version"]}, "role": ["Application Server"], "cpu_cores": ["8"], "create_time": "2017-04-03 16:32:27.738432", "mod_timestamp": "2019-06-26T01:17:23.933103+00:00", "title": "Item3", "family": ["dfsfd"], "OS": ["dfdsfsf"], "sdfdsfdsds": "fdsfdsf", "dsfdsfsd": ["64432.5390625", "64432.55859375", "64432.36328125"], "host": ["dfdsfsdfsdfds"], "sdfdsfds": "sdfdsf", "vend": ["sdada"], "permissions": {"delete": true, "write": true, "user": "dsdsds", "group": {"delete": true, "write": true, "read": true}, "read": true}, "sdsdsdsdsds": ["dfsdfdsfdsfsd"], "_version": "3", "sgrp": "default", "object_type": "dfsfs", "mod_by": "user", "mod_time": "2019-06-25 13:09:47.543535", "_user": "user", "environment": ["dfsdfdfsd"], "description": "", "identifier": {"values": ["dfsdfdfdsffdfsdfs"], "fields": ["host"]}, "sdfdsfsfds": ["SMP"], "role": ["operating_system_host"], "mod_source": "REST", "_key": "afderea-be2d-47a6-9f0d-00857ereef6c", "version": ["2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "create_source": "unknown", "services": [{"title": "STitle", "_key": "865defee-d47f-4b8f-9435-bc4ere89e9b1f8d"}, {"title": "STitle2", "_key": "d9d5e231-3841-4376-a295-ea5fere95168482"}, {"title": "STitle3", "_key": "38165ff4-9da6-df-9a8b-a162aa7a68e8"}, {"title": "SSTitle5", "_key": "e2adb75e-9254-4774-b735-"}, {"title": "STitle6", "_key": "381f54d0-d759-43a3-94b3"}, {"title": "STitle7", "_key": "8253-f2b6a1d6f836"}, {"title": "STitle8", "_key": "bc69692b-48d8-4bd7-b62b"}]}]

Edit: id_name can be used instead of title.

Edit: added better sample data.

            {"id_name": "Item3", "informational": {"values": ["werwe", "werwwe", "8", "ewrwrw", "werewrew", "64432.5390625", "64432.55859375", "64432.36328125", "werw werwerw", "2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "fields": ["werwerw", "erwrwr", "wewrewrer", "werrwrwer", "werwerwrw", "werwerewr", "werwrwr", "stuff", "vendor_product", "version"]}, "role": ["Application Server"], "cpu_cores": ["8"], "create_time": "2017-04-03 16:32:27.738432", "mod_timestamp": "2019-06-26T01:17:23.933103+00:00", "title": "Item3", "family": ["dfsfd"], "OS": ["dfdsfsf"], "sdfdsfdsds": "fdsfdsf", "dsfdsfsd": ["64432.5390625", "64432.55859375", "64432.36328125"], "host": ["dfdsfsdfsdfds"], "sdfdsfds": "sdfdsf", "vend": ["sdada"], "permissions": {"delete": true, "write": true, "user": "dsdsds", "group": {"delete": true, "write": true, "read": true}, "read": true}, "sdsdsdsdsds": ["enttitle"=Item1, 'hostname=myHostname"], "_version": "3", "sgrp": "default", "object_type": "dfsfs", "mod_by": "user", "mod_time": "2019-06-25 13:09:47.543535", "_user": "user", "environment": ["dfsdfdfsd"], "description": "", "identifier": {"values": ["dfsdfdfdsffdfsdfs"], "fields": ["host"]}, "sdfdsfsfds": ["SMP"], "role": ["operating_system_host"], "mod_source": "REST", "_key": "afderea-be2d-47a6-9f0d-00857ereef6c", "version": ["2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "create_source": "unknown", "services": [{"title": "STitle", "_key": "865defee-d47f-4b8f-9435-bc4ere89e9b1f8d"}, {"title": "STitle2", "_key": "d9d5e231-3841-4376-a295-ea5fere95168482"}, {"title": "STitle3", "_key": "38165ff4-9da6-df-9a8b-a162aa7a68e8"}, {"title": "SSTitle5", "_key": "e2adb75e-9254-4774-b735-"}, {"title": "STitle6", "_key": "381f54d0-d759-43a3-94b3"}, {"title": "STitle7", "_key": "8253-f2b6a1d6f836"}, {"title": "STitle8", "_key": "bc69692b-48d8-4bd7-b62b"},{"id_name": 
"Item3", "informational": {"values": ["werwe", "werwwe", "8", "ewrwrw", "werewrew", "64432.5390625", "64432.55859375", "64432.36328125", "werw werwerw", "2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "fields": ["werwerw", "erwrwr", "wewrewrer", "werrwrwer", "werwerwrw", "werwerewr", "werwrwr", "stuff", "vendor_product", "version"]}, "role": ["Application Server"], "cpu_cores": ["8"], "create_time": "2017-04-03 16:32:27.738432", "mod_timestamp": "2019-06-26T01:17:23.933103+00:00", "title": "Item3", "family": ["dfsfd"], "OS": ["dfdsfsf"], "sdfdsfdsds": "fdsfdsf", "dsfdsfsd": ["64432.5390625", "64432.55859375", "64432.36328125"], "host": ["dfdsfsdfsdfds"], "sdfdsfds": "sdfdsf", "vend": ["sdada"], "permissions": {"delete": true, "write": true, "user": "dsdsds", "group": {"delete": true, "write": true, "read": true}, "read": true}, "sdsdsdsdsds": ["enttitle"=Item2, 'hostname=myHostname"], "_version": "3", "sgrp": "default", "object_type": "dfsfs", "mod_by": "user", "mod_time": "2019-06-25 13:09:47.543535", "_user": "user", "environment": ["dfsdfdfsd"], "description": "", "identifier": {"values": ["dfsdfdfdsffdfsdfs"], "fields": ["host"]}, "sdfdsfsfds": ["SMP"], "role": ["operating_system_host"], "mod_source": "REST", "_key": "afderea-be2d-47a6-9f0d-00857ereef6c", "version": ["2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "create_source": "unknown", "services": [{"title": "STitle", "_key": "865defee-d47f-4b8f-9435-bc4ere89e9b1f8d"}, {"title": "STitle2", "_key": "d9d5e231-3841-4376-a295-ea5fere95168482"}, {"title": "STitle3", "_key": "38165ff4-9da6-df-9a8b-a162aa7a68e8"}, {"title": "SSTitle4", "_key": "e2adb75e-9254-4774-b735-"}, {"title": "STitle6", "_key": "381f54d0-d759-43a3-94b3"}, {"title": "STitle7", "_key": "8253-f2b6a1d6f836"}, {"title": "STitle8", "_key": "bc69692b-48d8-4bd7-b62b"},{"id_name": "Item3", "informational": 
{"values": ["werwe", "werwwe", "8", "ewrwrw", "werewrew", "64432.5390625", "64432.55859375", "64432.36328125", "werw werwerw", "2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "fields": ["werwerw", "erwrwr", "wewrewrer", "werrwrwer", "werwerwrw", "werwerewr", "werwrwr", "stuff", "vendor_product", "version"]}, "role": ["Application Server"], "cpu_cores": ["8"], "create_time": "2017-04-03 16:32:27.738432", "mod_timestamp": "2019-06-26T01:17:23.933103+00:00", "title": "Item3", "family": ["dfsfd"], "OS": ["dfdsfsf"], "sdfdsfdsds": "fdsfdsf", "dsfdsfsd": ["64432.5390625", "64432.55859375", "64432.36328125"], "host": ["dfdsfsdfsdfds"], "sdfdsfds": "sdfdsf", "vend": ["sdada"], "permissions": {"delete": true, "write": true, "user": "dsdsds", "group": {"delete": true, "write": true, "read": true}, "read": true}, "sdsdsdsdsds": ["enttitle"=Item3, 'hostname=myHostname"], "_version": "3", "sgrp": "default", "object_type": "dfsfs", "mod_by": "user", "mod_time": "2019-06-25 13:09:47.543535", "_user": "user", "environment": ["dfsdfdfsd"], "description": "", "identifier": {"values": ["dfsdfdfdsffdfsdfs"], "fields": ["host"]}, "sdfdsfsfds": ["SMP"], "role": ["operating_system_host"], "mod_source": "REST", "_key": "afderea-be2d-47a6-9f0d-00857ereef6c", "version": ["2.6.32-754.6.3.el6.x86_64", "2.6.32-696.16.1.el6.x86_64", "2.6.32-696.3.1.el6.x86_64", "2.6.32-642.13.1.el6.x86_64"], "create_source": "unknown", "services": [{"title": "STitle", "_key": "865defee-d47f-4b8f-9435-bc4ere89e9b1f8d"}, {"title": "STitle2", "_key": "d9d5e231-3841-4376-a295-ea5fere95168482"}, {"title": "STitle3", "_key": "38165ff4-9da6-df-9a8b-a162aa7a68e8"}, {"title": "SSTitle5", "_key": "e2adb75e-9254-4774-b735-"}, {"title": "STitle6", "_key": "381f54d0-d759-43a3-94b3"}, {"title": "STitle7", "_key": "8253-f2b6a1d6f836"}, {"title": "STitle8", "_key": "bc69692b-48d8-4bd7-b62b"}]}]

The `"enttitle"=` entry actually contains the exact value I need. The data files are all a single line, with no line breaks.

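This seems like a roundabout way to do it, but I managed to get the result I wanted. I'm open to more efficient suggestions. In the meantime, here is what I wrote, in case anyone else needs to do something like this:

In short:

  1. Get all the JSON data from the file and remove special characters -> store in a var
  2. Split the new string data into an array, using "id_name" as the delimiter
  3. Loop through the array and add each element to a new array only if STitle5 exists in the original element
  4. Loop through the newest array and grep the pattern between "id_name" and "informational", which is the item name, and add it to a new array

Note: I ended up using "id_name:Item1" at the beginning rather than title:Item1. I would have preferred to match on the title, but title appears in a million places. For now id_name at least equals title, although it doesn't have to. I'll need to make this more robust in the future.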

        #!/bin/bash
        # Grab each record (from one "id_name" up to the next, or to the closing "}]")
        # and strip JSON punctuation so only bare words remain.
        G_MATCH=$(grep -oP '(?=id_name).+?(?=id_name|}(](?!,)))' file.json | tr -d '[",:{}[]].')

        # Split the result into an array, using "id_name" as the delimiter.
        # The string is consumed from the end, so each chunk is prepended to keep order.
        AR_ALL=()
        tmp="$G_MATCH"
        while [[ -n $tmp ]]; do
            tail=${tmp%id_name*}
            AR_ALL=("${tmp#"$tail"}" "${AR_ALL[@]}")
            tmp="$tail"
        done

        SUBSTRING="STitle5"

        # Keep only the records that contain the requested service name.
        AR_MATCHES=()
        for i in "${AR_ALL[@]}"; do
            if grep -q "$SUBSTRING" <<<"$i"; then
                AR_MATCHES+=("$i")
            else
                echo "not adding..." >&2   # diagnostics go to stderr
            fi
        done

        # Extract the item name, i.e. the text between "id_name" and "informational".
        AR_TITLES=()
        for i in "${AR_MATCHES[@]}"; do
            TEMP_VAL=$(grep -oP '(?<=id_name )(.*)(?= informational)' <<<"$i")
            if [[ -n $TEMP_VAL ]]; then
                AR_TITLES+=("$TEMP_VAL")
            else
                echo "not adding..." >&2
            fi
        done

        printf '%s\n' "${AR_TITLES[@]}"

Output:

        Item1
        Item3
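
The same split-then-filter idea can probably be expressed with bash parameter expansion alone, without `grep -P` or intermediate arrays. The following is only a minimal sketch: the inline `json` sample is a hypothetical, heavily trimmed stand-in for the real file, the variable names are made up for illustration, and it assumes the text is formatted exactly as `"id_name": "` and `"title": "STitle5"` (one space after each colon).

```shell
#!/usr/bin/env bash
# Hypothetical single-line sample, trimmed to the keys that matter here.
json='[{"id_name": "Item1", "services": [{"title": "STitle5"}]}, {"id_name": "Item2", "services": [{"title": "STitle2"}]}, {"id_name": "Item3", "services": [{"title": "STitle5"}]}]'

delim='"id_name": "'            # assumed to start every record
needle='"title": "STitle5"'     # exact text of the service entry to look for

titles=()
rest=${json#*"$delim"}          # drop everything up to the first item name
while [[ -n $rest ]]; do
    record=${rest%%"$delim"*}   # text up to (not including) the next record
    if [[ $record == *"$needle"* ]]; then
        titles+=("${record%%\"*}")   # item name = text before the first quote
    fi
    if [[ $rest == *"$delim"* ]]; then
        rest=${rest#*"$delim"}  # advance to the next record
    else
        rest=''                 # that was the last record
    fi
done

printf '%s\n' "${titles[@]}"
```

Because this still treats JSON as plain text, it breaks as soon as the spacing or key order changes, and repeated expansion over a very large string may be slow.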

Is this all you're trying to do?

    $ awk '/"STitle5"/{gsub(/^"|",$/,"",$2); print $2}' file
    Item1
    Item3

    $ arr=( $(awk '/"STitle5"/{gsub(/^"|",$/,"",$2); print $2}' file) )
    $ echo "${arr[0]}"
    Item1
    $ echo "${arr[1]}"
    Item3
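
If item names could ever contain spaces or glob characters, `arr=( $(...) )` would split or expand them. A possible alternative, assuming bash 4 or later, is `mapfile -t`, which stores one array element per output line. The sample file below is hypothetical and trimmed, with one record per line as the awk command above expects.

```shell
#!/usr/bin/env bash
# Hypothetical sample: one record per line, reduced to the relevant keys.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[{"id_name": "Item1", "services": [{"title": "STitle5"}]},
{"id_name": "Item2", "services": [{"title": "STitle2"}]},
{"id_name": "Item3", "services": [{"title": "STitle5"}]}]
EOF

# mapfile -t reads one element per line, with no word splitting or globbing.
mapfile -t arr < <(awk '/"STitle5"/{gsub(/^"|",$/,"",$2); print $2}' "$sample")
rm -f "$sample"

printf '%s\n' "${arr[@]}"
```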

Update: given your new single-line input, you could do this using GNU awk for multi-char RS:

    $ awk -v RS='{"id_name":' '/"SSTitle5"/{gsub(/^"|",$/,"",$1); print $1}' file
    Item3
    Item3

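Note that in your new input all of the items are named "Item3" and "STitle" has been replaced with "SSTitle". Despite your comments, there is no field named enttitle in your sample data.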
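
Multi-character `RS` is an extension, so the command above may not behave the same in every awk. One way to get a similar result without it is to `split()` the single input line on the record prefix inside awk. This is a rough sketch under assumptions: the sample line is a hypothetical, trimmed version of the data, and the `want` variable and array names are invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical single-line sample, reduced to the relevant keys.
sample=$(mktemp)
printf '%s\n' '[{"id_name": "Item1", "services": [{"title": "STitle5"}]}, {"id_name": "Item2", "services": [{"title": "STitle2"}]}, {"id_name": "Item3", "services": [{"title": "STitle5"}]}]' > "$sample"

result=$(awk -v want='"STitle5"' '
{
    # rec[1] is whatever precedes the first record; real records start at rec[2].
    n = split($0, rec, /[{]"id_name": */)
    for (i = 2; i <= n; i++) {
        if (index(rec[i], want)) {       # plain substring test, no regex
            split(rec[i], part, "\"")    # part[2] is the text inside the first quotes
            print part[2]
        }
    }
}' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

Using `index()` rather than a regex match keeps characters in the service title from being interpreted as pattern syntax.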

