Extract multiple fields using scalars in jq
I'm trying to select only specific fields, together with their full paths, from a JSON file (the result comes from Elasticsearch).
My JSON file:
{
  "_index": "ships",
  "_type": "doc",
  "_id": "c36806c10a96a3968c07c6a222cfc818",
  "_score": 0.057158414,
  "_source": {
    "user_email": "admin@example.com",
    "current_send_date": 1552557382,
    "next_send_date": 1570798063,
    "data_name": "atari",
    "statistics": {
      "game_mode": "engineer",
      "opened_game": 0,
      "user_score": 0,
      "space_1": {
        "ship_send_priority": 10,
        "ssl_required": "true",
        "ship_send_delay": 15,
        "user_score": 0,
        "template1": {
          "current_ship_status": "sent",
          "current_ship_date": "4324242",
          "checked_link_before_clicked": 0
        },
        "template2": {
          "current_ship_status": "sent",
          "current_ship_date": "4324242",
          "checked_payload": 0
        }
      }
    }
  }
}
I flatten the keys into one-line path/value pairs:
<file jq -c 'paths(scalars) as $p | [$p, getpath($p)]'
[["_index"],"ships"]
[["_type"],"doc"]
[["_id"],"c36806c10a96a3968c07c6a222cfc818"]
[["_score"],0.057158414]
[["_source","user_email"],"admin@example.com"]
[["_source","current_send_date"],1552557382]
[["_source","next_send_date"],1570798063]
[["_source","data_name"],"atari"]
[["_source","statistics","game_mode"],"engineer"]
[["_source","statistics","opened_game"],0]
[["_source","statistics","user_score"],0]
[["_source","statistics","space_1","ship_send_priority"],10]
[["_source","statistics","space_1","ssl_required"],"true"]
[["_source","statistics","space_1","ship_send_delay"],15]
[["_source","statistics","space_1","user_score"],0]
[["_source","statistics","space_1","template1","current_ship_status"],"sent"]
[["_source","statistics","space_1","template1","current_ship_date"],"4324242"]
[["_source","statistics","space_1","template1","checked_link_before_clicked"],0]
[["_source","statistics","space_1","template2","current_ship_status"],"sent"]
[["_source","statistics","space_1","template2","current_ship_date"],"4324242"]
[["_source","statistics","space_1","template2","checked_payload"],0]
Then I pipe the output to grep to pull out all the fields I want:
<file jq -c 'paths(scalars) as $p | [$p, getpath($p)]' | grep -e '"_index"\|current_send_date\|current_send_date\|ship_send_delay\|ship_send_priority\|current_ship_status'
[["_index"],"ships"]
[["_source","current_send_date"],1552557382]
[["_source","statistics","space_1","ship_send_priority"],10]
[["_source","statistics","space_1","ship_send_delay"],15]
[["_source","statistics","space_1","template1","current_ship_status"],"sent"]
[["_source","statistics","space_1","template2","current_ship_status"],"sent"]
Finally, I pipe grep's output to sed to strip the characters I don't need, which gives the result I want:
<file jq -c 'paths(scalars) as $p | [$p, getpath($p)]' | grep -e '"_index"\|current_send_date\|current_send_date\|ship_send_delay\|ship_send_priority\|current_ship_status' | sed -e 's/\[\["//g' -e 's/","/./g' -e 's/"],"/=/g' -e 's/"],/=/g' -e 's/]$//g' -e 's/"$//g'
_index=ships
_source.current_send_date=1552557382
_source.statistics.space_1.ship_send_priority=10
_source.statistics.space_1.ship_send_delay=15
_source.statistics.space_1.template1.current_ship_status=sent
_source.statistics.space_1.template2.current_ship_status=sent
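I'm looking for a better way to extract the fields, at least doing the selection in jq without grep. I can live with sed for the final formatting, but I feel there must be a better way to pick the fields than grep. I believe there must be something like select(.mykey|.mykey1|.mykey2) that can do this.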
Use join and string interpolation ( \(...) ):
$ jq -r 'paths(scalars) as $p | "\($p|join("."))=\(getpath($p))"' file
_index=ships
_type=doc
_id=c36806c10a96a3968c07c6a222cfc818
_score=0.057158414
_source.user_email=admin@example.com
_source.current_send_date=1552557382
_source.next_send_date=1570798063
_source.data_name=atari
_source.statistics.game_mode=engineer
_source.statistics.opened_game=0
_source.statistics.user_score=0
_source.statistics.space_1.ship_send_priority=10
_source.statistics.space_1.ssl_required=true
_source.statistics.space_1.ship_send_delay=15
_source.statistics.space_1.user_score=0
_source.statistics.space_1.template1.current_ship_status=sent
_source.statistics.space_1.template1.current_ship_date=4324242
_source.statistics.space_1.template1.checked_link_before_clicked=0
_source.statistics.space_1.template2.current_ship_status=sent
_source.statistics.space_1.template2.current_ship_date=4324242
_source.statistics.space_1.template2.checked_payload=0
In fact, if you have a recent version of jq (one that provides the IN builtin), you don't even need grep. Try this:
(paths(scalars) | select(IN(.[];
    "_index",
    "current_send_date",
    "ship_send_delay",
    "ship_send_priority",
    "current_ship_status"
))) as $p | "\($p|join("."))=\(getpath($p))"
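For reference, here is a minimal end-to-end sketch of that filter as a shell command. It assumes a jq build that includes IN and that the wanted key is current_send_date, as in the grep pattern from the question. The sample document is a trimmed copy of the question's JSON, and the variable names json and result are only illustrative.

```shell
#!/bin/sh
# Trimmed copy of the document from the question (only a few fields kept).
json='{
  "_index": "ships",
  "_source": {
    "current_send_date": 1552557382,
    "next_send_date": 1570798063,
    "statistics": {
      "space_1": {
        "ship_send_priority": 10,
        "ship_send_delay": 15,
        "template1": {"current_ship_status": "sent", "current_ship_date": "4324242"},
        "template2": {"current_ship_status": "sent", "current_ship_date": "4324242"}
      }
    }
  }
}'

# Keep only the scalar paths in which some component equals one of the
# wanted keys, then print each surviving path as dotted.path=value.
result=$(printf '%s\n' "$json" | jq -r '
  (paths(scalars) | select(IN(.[];
      "_index",
      "current_send_date",
      "ship_send_delay",
      "ship_send_priority",
      "current_ship_status"
  ))) as $p
  | "\($p | join("."))=\(getpath($p))"
')

printf '%s\n' "$result"
```

Note that IN(.[]; ...) compares every component of the path, not just the last one, so listing a container key such as "template1" would select everything beneath it. If only the leaf key should be matched, something like select(.[-1] | IN(...)) should do it, assuming the paths of interest end in object keys.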