Struggling with parsing JSON with jq
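I have read every related post and have been playing with this for hours, but I still can't get the hang of this tool. It seems to be exactly what I need, if only I could find a way to make it work the way I want. Here is a sample of my JSON: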
{
"res": "0",
"main": {
"All": [
{
"field1": "a",
"field2": "aa",
"field3": "aaa",
"field4": "0",
"active": "true",
"id": "1"
},
{
"field1": "b",
"field2": "bb",
"field3": "bbb",
"field4": "0",
"active": "false",
"id": "2"
},
{
"field1": "c",
"field2": "cc",
"field3": "ccc",
"field4": "0",
"active": "true",
"id": "3"
},
{
"field1": "d",
"field2": "dd",
"field3": "ddd",
"field4": "0",
"active": "true",
"id": "4"
}
]
}
}
I would like to selectively extract some of the fields and get CSV output like this:
field1,field2,field3,id
a,aa,aaa,1
b,bb,bbb,2
c,cc,ccc,3
d,dd,ddd,4
Note that I have skipped some fields, and I am also not interested in the parent arrays and so on. Thanks a lot.
First, your JSON needs to be fixed as follows:
{
"main": {
},
"table": {
"All": [
{
"field1": "a",
"field2": "aa",
"field3": "aaa",
"field4": "0",
"active": "true",
"id": "1"
},
{
"field1": "b",
"field2": "bb",
"field3": "bbb",
"field4": "0",
"active": "false",
"id": "2"
},
{
"field1": "c",
"field2": "cc",
"field3": "ccc",
"field4": "0",
"active": "true",
"id": "3"
},
{
"field1": "d",
"field2": "dd",
"field3": "ddd",
"field4": "0",
"active": "true",
"id": "4"
}
]
},
"res": "0"
}
Second, with jq you can do the following to produce table output using column:
{ echo Field1 Field2 Field3 ID ; cat data.json | jq -r '.table.All[] | (.field1, .field2, .field3, .id)' | xargs -L4 ; } | column -t
Output:
Field1  Field2  Field3  ID
a       aa      aaa     1
b       bb      bbb     2
c       cc      ccc     3
d       dd      ddd     4
Using sed:
echo "field1,field2,field3,id" ;cat data.json | jq -r '.table.All[] | (.field1, .field2, .field3, .id)' | xargs -L4 | sed 's/ /,/g'
Output:
field1,field2,field3,id
a,aa,aaa,1
b,bb,bbb,2
c,cc,ccc,3
d,dd,ddd,4
Update:
There is no need for sed or xargs; jq can format the output as CSV itself, like this:
cat data.json | jq -r '.table.All[] | [.field1, .field2, .field3, .id] | @csv'
Output:
"a","aa","aaa","1"
"b","bb","bbb","2"
"c","cc","ccc","3"
"d","dd","ddd","4"
Thanks to chepner for bringing up the header in the comments. The header row can be added directly with jq, like this:
jq -r '(([["field1", "field2", "field3", "id"]]) + [(.table.All[] | [.field1,.field2,.field3,.id])])[]|@csv' data.json
Output:
"field1","field2","field3","id"
"a","aa","aaa","1"
"b","bb","bbb","2"
"c","cc","ccc","3"
"d","dd","ddd","4"
Based on the latest JSON data you provided in the question, this command should work correctly:
jq -r '(([["field1", "field2", "field3", "id"]]) + [(.main.All[] | [.field1,.field2,.field3,.id])])[]|@csv' data.json
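[["field1", "field2", "field3", "id"]] : the first part of the command, used for the CSV header.
[(.main.All[] | [.field1,.field2,.field3,.id])] : since main is the parent key in the JSON, you can select it with .main, which gives the object holding the array All. To print the contents of that array you must add [] after its name, so the full expression is .main.All[], which emits multiple dictionaries. To keep only the keys we want, we pipe the output of .main.All[] into another array that lists those keys, e.g. [.field1,.field2,.field3,.id].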
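The pieces described above can be tried one at a time. The following is a minimal sketch, assuming jq is installed and using a shortened two-record copy of the question's data written to a temporary file (the variable names data, rows and csv are only illustrative):

```shell
# Write a shortened copy of the question's JSON to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
{"res":"0","main":{"All":[
  {"field1":"a","field2":"aa","field3":"aaa","field4":"0","active":"true","id":"1"},
  {"field1":"b","field2":"bb","field3":"bbb","field4":"0","active":"false","id":"2"}
]}}
EOF

# Step 1: .main.All[] emits each element of the array as a separate object;
# piping each object into [ ... ] keeps only the wanted keys, one array per record.
rows=$(jq -c '.main.All[] | [.field1, .field2, .field3, .id]' "$data")

# Step 2: prepend the header row, then render every row with @csv.
csv=$(jq -r '([["field1","field2","field3","id"]] + [.main.All[] | [.field1, .field2, .field3, .id]])[] | @csv' "$data")

printf '%s\n' "$rows" "$csv"
rm -f "$data"
```

Printing rows before csv makes it easier to see that @csv only changes how each already-selected array is rendered.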
Here is a jq-only solution in which the desired keys are specified only once, e.g. on the command line:
jq -r --argjson f '["field1", "field2", "field3", "id"]' '
$f, (.table.All[] | [getpath( $f[]|[.])]) | @csv'
Output:
"field1","field2","field3","id"
"a","aa","aaa","1"
"b","bb","bbb","2"
"c","cc","ccc","3"
"d","dd","ddd","4"
One way to avoid quoting the strings is to pipe to join(",") (or join(", ")) instead of @csv:
field1,field2,field3,id
a,aa,aaa,1
b,bb,bbb,2
c,cc,ccc,3
d,dd,ddd,4
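The command that produces this kind of output is not shown here. A sketch of what it might look like, assuming the .main.All layout from the question and a shortened two-record sample (the temporary file and the variable names are illustrative only):

```shell
# Shortened two-record copy of the question's data.
data=$(mktemp)
cat > "$data" <<'EOF'
{"res":"0","main":{"All":[
  {"field1":"a","field2":"aa","field3":"aaa","field4":"0","active":"true","id":"1"},
  {"field1":"b","field2":"bb","field3":"bbb","field4":"0","active":"false","id":"2"}
]}}
EOF

# Emit the header array first, then one array per record; join(",")
# concatenates the strings of each array with commas and adds no quoting.
out=$(jq -r '["field1","field2","field3","id"], (.main.All[] | [.field1, .field2, .field3, .id]) | join(",")' "$data")

printf '%s\n' "$out"
rm -f "$data"
```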
Of course, this may be unacceptable if the values contain commas. In general, if avoiding quotes around the strings is important, @tsv is a good option to consider.
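To make the trade-off concrete, here is a small sketch using a made-up record whose first value contains a comma (the sample data is hypothetical and not taken from the question; @tsv is assumed to be available, which requires jq 1.5 or later):

```shell
# A made-up record with a comma inside one value.
json='{"main":{"All":[{"field1":"a,b","field2":"aa","id":"1"}]}}'

# join(",") leaves the embedded comma unprotected, so the row looks like four columns.
joined=$(printf '%s' "$json" | jq -r '.main.All[] | [.field1, .field2, .id] | join(",")')

# @csv quotes every string, so the embedded comma stays inside one field.
csv=$(printf '%s' "$json" | jq -r '.main.All[] | [.field1, .field2, .id] | @csv')

# @tsv separates fields with tab characters and adds no quotes.
tsv=$(printf '%s' "$json" | jq -r '.main.All[] | [.field1, .field2, .id] | @tsv')

printf '%s\n' "$joined" "$csv" "$tsv"
```

With tab-separated output the comma is harmless, although a value containing a literal tab would be escaped by @tsv rather than passed through.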