I have a huge JSON file (15 GB) that looks like this:
{
  "userActivities": {
    "-L3ATRosRd-bDgSmX75Z": {
      "deviceId": "60ee32c2fae8dcf0",
      "dow": "Friday"
    }
  },
  "users": {
    "0GTDyAepIjcKMB1XulHCYLXylFS2": {
      "ageRangeMin": 21,
      "age_range": {
        "min": 21
      },
      "gender": "male"
    },
    "0GTDyAepIjcKMB1S2": {
      "ageRangeMin": 22,
      "age_range": {
        "min": 20
      },
      "gender": "male"
    }
  }
}
I want to extract the objects as if by .users[], but using the streaming parser (jq --stream). That is, I want my output to be as follows:
{"ageRangeMin":21,"age_range":{"min":21},"gender":"male"}
{"ageRangeMin":22,"age_range":{"min":20},"gender":"male"}
Any guidance or help is greatly appreciated. I don't understand how jq --stream works.
If the goal is just to get the objects at a certain depth of the JSON object tree, you can truncate the stream.
$ jq --stream -nc 'fromstream(2|truncate_stream(inputs | select(.[0][:1] == ["users"])))'
Just make sure you're running the latest available jq. There's a bug in jq 1.5's truncate_stream/1 that breaks it for any depth greater than 1.
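To see what each stage of that filter does, it can help to run the stages separately on a small file. The following is a minimal sketch, assuming jq 1.6 or later and a made-up miniature of the input saved as sample.json (a hypothetical file name, not part of the question). The commented lines show the output those versions should print.

```shell
# Hypothetical miniature of the real file (illustration only).
cat > sample.json <<'EOF'
{"userActivities":{"x":{"dow":"Friday"}},"users":{"a":{"gender":"male"},"b":{"gender":"female"}}}
EOF

# Stage 1: the raw events under "users". Each scalar becomes [path, value];
# a one-element [path] event, carrying the path of the last entry,
# marks the end of an object.
jq -c --stream 'select(.[0][:1] == ["users"])' sample.json
# [["users","a","gender"],"male"]
# [["users","a","gender"]]
# [["users","b","gender"],"female"]
# [["users","b","gender"]]
# [["users","b"]]
# [["users"]]

# Stage 2: 2|truncate_stream drops the first two path components and
# discards events whose path has two components or fewer.
jq -nc --stream '2|truncate_stream(inputs | select(.[0][:1] == ["users"]))' sample.json
# [["gender"],"male"]
# [["gender"]]
# [["gender"],"female"]
# [["gender"]]

# Stage 3: fromstream emits an object each time it sees a closing event
# whose path has length 1, so every user comes out separately.
jq -nc --stream 'fromstream(2|truncate_stream(inputs | select(.[0][:1] == ["users"])))' sample.json
# {"gender":"male"}
# {"gender":"female"}
```

Since each user is emitted as soon as its closing event arrives, this approach should only need to hold one user object in memory at a time.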
With your input in input.json, the following invocation:
$ jq -nc --stream '
fromstream(inputs|select(.[0][0] == "users"))|.[][]' input.json
yields:
{"ageRangeMin":21,"age_range":{"min":21},"gender":"male"}
{"ageRangeMin":22,"age_range":{"min":20},"gender":"male"}
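The idea is to first extract the "users" key-value pair as a single-key object, and then use .[][] to emit each of its values.
Note that the -n option must be used here. Without it, jq consumes the first event as . before inputs runs, so inputs never sees that event.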
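Both points can be checked on a small file. This is only a sketch: tiny.json is a hypothetical two-key sample made up for illustration, and the commented lines show what jq should print for it.

```shell
# Hypothetical two-key sample (illustration only).
printf '%s\n' '{"other":1,"users":{"a":{"gender":"male"}}}' > tiny.json

# The intermediate value: fromstream rebuilds "users" as a single-key object.
jq -nc --stream 'fromstream(inputs | select(.[0][0] == "users"))' tiny.json
# {"users":{"a":{"gender":"male"}}}

# With -n, inputs sees all five events of the stream.
jq -nc --stream '[inputs] | length' tiny.json
# 5

# Without -n, the first event is consumed as `.`, so inputs sees only four.
jq -c --stream '[inputs] | length' tiny.json
# 4
```

Because the whole "users" object is rebuilt before .[][] iterates over it, memory use with this variant grows with the size of "users", which may matter for a 15 GB file.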