I have a ton of records (~4,500) that I've processed (using jq) down to a sequence of JSON grouped by hourly UTC time (~680 groups, all unique).
{
"2018-10-09T19:00:00.000Z": []
}
{
"2018-10-09T20:00:00.000Z": []
}
{
"2018-10-09T21:00:00.000Z": []
}
I'm pretty sure you can see where this is going, but I want to combine all these into a single JSON object to hand over to another system for more fun.
{
"2018-10-09T19:00:00.000Z": [],
"2018-10-09T20:00:00.000Z": [],
"2018-10-09T21:00:00.000Z": []
}
The last two things I'm doing before I get to the sequence of objects are:
group_by(.day)[] | { (.[0].day): . }
where .day is the ISO date you see referenced above.
I've tried a few things with the map and reduce functions, but can't seem to massage the data the way I want. I've spent a few hours on this and need to take a break, so any help or direction you can point me in would be great!
If everything is already in memory, you could modify the group_by line as follows (note the closing parenthesis):
reduce group_by(.day)[] as $in ({}; . + { ($in[0].day): $in })
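As a quick sanity check, here is that filter run end-to-end, assuming jq is on your PATH; the inline sample records (including the .v field) are made up for illustration:

```shell
# Demo input: an array of records, each carrying a .day field with the ISO hour.
printf '%s' '[{"day":"2018-10-09T19:00:00.000Z","v":1},
              {"day":"2018-10-09T19:00:00.000Z","v":2},
              {"day":"2018-10-09T20:00:00.000Z","v":3}]' |
jq 'reduce group_by(.day)[] as $in ({}; . + { ($in[0].day): $in })'
# Produces a single object with one key per distinct .day value,
# each mapping to the array of records for that hour.
```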
group_by
Since group_by entails a sort, it may be unnecessarily inefficient. You might like to consider using a sort-free variant such as the following:
# sort-free variant of group_by/1
# f must always evaluate to an integer or always to a string.
# Output: an array in the former case, or an object in the latter case
def GROUP_BY(f): reduce .[] as $x ({}; .[$x|f] += [$x] );
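Because GROUP_BY(.day) already produces an object keyed by the grouping value, it replaces both the group_by step and the key-wrapping step in one go. A small sketch with made-up records (the "B"/"A" keys are chosen out of order to show that no sorting occurs):

```shell
printf '%s' '[{"day":"B","v":1},{"day":"A","v":2}]' |
jq -c 'def GROUP_BY(f): reduce .[] as $x ({}; .[$x|f] += [$x]); GROUP_BY(.day)'
# → {"B":[{"day":"B","v":1}],"A":[{"day":"A","v":2}]}
```

Note that the keys come out in first-seen order rather than sorted order, which is exactly the overhead being avoided.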
If the stream of objects is already in a file, use inputs with the -n command-line option.
This will avoid the overhead of "slurping" but will still require enough RAM for the entire result to fit into memory. If that doesn't work for you, then you will have to resort to desperate measures :-)
This might be a useful starting point:
jq -n 'reduce inputs as $in ({}; . + $in)'
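Applied to a stream of single-key objects like the ones in the question (piped in here for illustration; in practice you would pass the filename instead):

```shell
printf '%s\n' '{"2018-10-09T19:00:00.000Z":[]}' '{"2018-10-09T20:00:00.000Z":[]}' |
jq -n -c 'reduce inputs as $in ({}; . + $in)'
# → {"2018-10-09T19:00:00.000Z":[],"2018-10-09T20:00:00.000Z":[]}
```

The -n option prevents jq from consuming the first input implicitly, so inputs sees every object in the stream.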