How to create a multi-dimensional result by iterating over batches of arrays

Question

I'm iterating over some input batches and generating results that have shape (BatchSize, X, Y). BatchSize may differ from one batch to the next. I'd like to return a single output that is the concatenation of the results along the batch dimension. What's the most elegant way to do this in NumPy?

I'm less concerned about performance than about handling the multi-dimensionality of the accumulated result array.

Answer 1

Assuming that you have enough memory to hold all of the results, a good solution is to simply pre-allocate the memory:

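import numpy as np

# Pre-allocate the full output; OUTPUT_SHAPE is (total_rows, X, Y).
result = np.empty(OUTPUT_SHAPE)
i = 0
while i < input_tensor.shape[0]:
    batch_size = get_batch_size(i)  # batch sizes may differ per iteration
    # Write this batch's results into its slice along the batch axis.
    result[i:i + batch_size] = deal_with_batch(input_tensor[i:i + batch_size])
    i += batch_size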
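
For reference, here is a self-contained sketch of the pre-allocation approach that can be run as-is. The trailing dimensions X and Y, the batch sizes, and the body of deal_with_batch are hypothetical stand-ins chosen only for illustration; they are not from the original answer.

```python
import numpy as np

X, Y = 3, 4  # hypothetical trailing output dimensions

def deal_with_batch(batch):
    # Hypothetical stand-in: maps a (b, X) batch to a (b, X, Y) result.
    return np.repeat(batch[:, :, np.newaxis], Y, axis=2)

def run_preallocated(input_tensor, batch_sizes):
    # The total number of rows is known up front, so allocate once.
    result = np.empty((input_tensor.shape[0], X, Y))
    i = 0
    for batch_size in batch_sizes:
        # Each batch fills its own slice along the batch axis.
        result[i:i + batch_size] = deal_with_batch(input_tensor[i:i + batch_size])
        i += batch_size
    return result

input_tensor = np.arange(7 * X, dtype=float).reshape(7, X)
result = run_preallocated(input_tensor, [2, 4, 1])
print(result.shape)  # (7, 3, 4)
```

This only works when the total number of rows is known before the loop starts, which is the assumption the answer states.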

Answer 2

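The answer by @Scott is correct. However, I was looking for an incremental version, which I think I've found:

Define results = np.empty((0, *output_shape)), where output_shape is the tuple (X, Y), and then update it in the loop using results = np.concatenate((results, some_func(x))). np.concatenate joins along axis 0 by default, which is the batch dimension here.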
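
A minimal runnable sketch of this incremental approach, assuming per-batch results of shape (b, X, Y). The values of X and Y, the example batches, and the body of some_func are hypothetical. The second half shows a common alternative (collect the pieces in a list and concatenate once), which is not part of the original answer.

```python
import numpy as np

X, Y = 3, 4  # hypothetical trailing output dimensions

def some_func(batch):
    # Hypothetical stand-in: maps a (b, X) batch to a (b, X, Y) result.
    return np.repeat(batch[:, :, np.newaxis], Y, axis=2)

batches = [np.ones((2, X)), np.zeros((4, X)), np.full((1, X), 5.0)]

# Incremental version: start with zero rows and grow along axis 0.
results = np.empty((0, X, Y))
for x in batches:
    results = np.concatenate((results, some_func(x)))  # axis=0 by default

# Alternative: collect per-batch results and concatenate once at the end.
collected = [some_func(x) for x in batches]
results_once = np.concatenate(collected, axis=0)

print(results.shape)  # (7, 3, 4)
```

Each call to np.concatenate copies the accumulated array, so the list-based variant usually does less work when there are many batches. Both produce the same result.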

I'm not sure how I should think about a dimension of size 0 in NumPy, but it works.
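
One way to look at it, sketched below with hypothetical dimensions 3 and 4: an array with a zero-length axis holds no elements but still has a well-defined shape, so it can act as an empty starting point whose trailing dimensions match the batches being appended.

```python
import numpy as np

empty = np.empty((0, 3, 4))   # zero rows, each row of shape (3, 4)
print(empty.shape)            # (0, 3, 4)
print(empty.size)             # 0

batch = np.ones((2, 3, 4))
# Concatenating along axis 0 only requires the other axes to match.
combined = np.concatenate((empty, batch))
print(combined.shape)         # (2, 3, 4)
```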

solution1
1 2019-01-21 12:04:14

solution2
0 2019-01-21 14:09:23
