
How to split a JSON array file into smaller JSON arrays using jq?

[{"foo": 1},
 {"foo": 2},
 {"foo": 3},
 {"foo": 4},
 {"foo": 5},
 {"foo": 6},
 {"foo": 7},
 {"foo": 8},
 {"foo": 9},
 {"foo": 10},
 {"foo": 11},
 {"foo": 12},
 {"foo": 13},
 {"foo": 14},
 {"foo": 15}
]

I want to break this array into smaller array files using jq.

So far I have tried this:

 cat foo.json | jq -c -M -s '.[]' | split -l 5 - charded/

This results in 3 separate files but does not wrap the dictionaries into an array.

jq's I/O facilities are rather primitive, so I'd suggest starting with the following jq program, saved as chunk.jq:

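# Emit successive slices of the input, each of length n (the last may be shorter)
def chunks(n):
  # c is a 0-arity helper that recurses on the remainder of the input
  def c: .[0:n], (if length > n then .[n:]|c else empty end);
  c;

chunks(5)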

The key now is to use the -c command-line option:

jq -c -f chunk.jq foo.json

With your data, this will produce a stream of three arrays, one per line.

You can pipe that into split or awk or whatever to send each line to a separate file, e.g.:

awk '{n++; print > "out" n ".json"}'
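As a sketch of the split variant mentioned above: because each chunk occupies exactly one line of the `jq -c` output, `split -l 1` should write one chunk per file. The `chunk_` prefix is an arbitrary name chosen for illustration, and the sample input is regenerated with jq so that the snippet is self-contained.

```shell
# Recreate the question's input: [{"foo":1}, ..., {"foo":15}]
jq -n -c '[range(1; 16) | {foo: .}]' > foo.json

# One chunk per output line, so one line per file.
# "chunk_" is an illustrative prefix (files become chunk_aa, chunk_ab, ...)
jq -c 'def chunks(n):
         def c: .[0:n], (if length > n then .[n:] | c else empty end);
         c;
       chunks(5)' foo.json |
  split -l 1 - chunk_

# Show the first chunk
cat chunk_aa
```

With the 15-element sample this leaves three files, chunk_aa through chunk_ac, each holding one array of five objects.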

If you want the arrays to be pretty-printed in each file, you could then run jq on each one, perhaps with sponge (from the moreutils package), along the lines of:

for f in out*.json ; do jq . "$f" | sponge "$f" ; done

def-free solution

If you don't want to define a function, or prefer a one-liner for the jq component of the pipeline, consider this:

jq -c --argjson n 5 'recurse(.[$n:]; length > 0) | .[0:$n]' foo.json
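To see how the one-liner treats a length that is not a multiple of n, here is a small sketch on a made-up input of seven numbers with n set to 3. The input array and the `result` variable are illustrative only; the filter is the same as above.

```shell
# 7 items in groups of 3: the final group holds the single leftover item
result=$(jq -c -n --argjson n 3 \
  '[range(1; 8)] | recurse(.[$n:]; length > 0) | .[0:$n]')

# Prints three lines: [1,2,3] then [4,5,6] then [7]
echo "$result"
```

Here recurse emits the whole array, then each successive remainder (.[$n:]) for as long as it is non-empty, and .[0:$n] keeps the first n items of each.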

Notes

  1. chunks will also work on strings.
  2. chunks defines the 0-arity function, c, to take advantage of jq's support for tail-call optimization.
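As an illustration of the first note, the same definition can be applied to a string, since jq slices strings as well as arrays. The sample string and the `pieces` variable are made up for this sketch.

```shell
# -n: no input file; -r: print raw strings rather than JSON-quoted ones
pieces=$(jq -n -r 'def chunks(n):
                     def c: .[0:n], (if length > n then .[n:] | c else empty end);
                     c;
                   "abcdefgh" | chunks(3)')

# Prints abc, def and gh on separate lines
echo "$pieces"
```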

If data.json is VERY large (e.g., too big to fit comfortably into RAM), and if you have a version of jq that includes the so-called streaming parser, then you could use jq first to split data.json into its top-level component elements, then regroup them, and finally use awk or split or whatever, as described elsewhere on this page.

Invocation

Here is the pipeline you'd use:

jq -cn --stream 'fromstream(1|truncate_stream(inputs))' data.json |
  jq -cn -f groups.jq
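To make the first stage of that pipeline concrete, the sketch below runs it on its own against a tiny made-up array of objects. It should print each top-level element on a separate line, which is the form the second stage reads via inputs. The sample data and the `elements` variable are illustrative only.

```shell
# First stage only: stream the top-level array and rebuild each element
elements=$(printf '%s' '[{"foo":1},{"foo":2},{"foo":3}]' |
  jq -cn --stream 'fromstream(1|truncate_stream(inputs))')

# Prints {"foo":1}, {"foo":2} and {"foo":3}, one per line
echo "$elements"
```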

groups.jq

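# Use nan as the end-of-stream (EOS) marker
def groups(stream; n):
  foreach (stream, nan) as $x ([];
    # Extend the current group, or start a new one once it is full
    if length < n then . + [$x] else [$x] end;
    # The type check keeps isnan from being applied to non-numeric items
    if (.[-1] | type == "number" and isnan)
    then (if length > 1 then .[:-1] else empty end)  # flush the final partial group
    elif length == n then .                           # emit each full group
    else empty end);

groups(inputs; 5)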
