
How to create a multi-dimensional result by iterating over batches of arrays

I'm iterating over some input batches and generating results that have shape (BatchSize, X, Y). BatchSize can vary from one batch to the next. I'd like to return a single output that is the concatenation of the results along the batch dimension. What's the most elegant way to do this in NumPy?

I'm less concerned about performance than about how to handle the multi-dimensionality of the accumulated result array.

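Assuming that you have enough memory to hold all of the results, a good solution is to pre-allocate the output array and fill it slice by slice:

import numpy as np

# OUTPUT_SHAPE is (total_items, X, Y); get_batch_size and deal_with_batch are your own helpers
result = np.empty(OUTPUT_SHAPE)
i = 0
while i < input_tensor.shape[0]:
    batch_size = get_batch_size(i)  # may differ from batch to batch
    # write this batch's results into its slice along the batch axis
    result[i:i + batch_size] = deal_with_batch(input_tensor[i:i + batch_size])
    i += batch_size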
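
For a version that runs on its own, here is a minimal sketch of the same pre-allocation idea. The helper `process_batch`, the list `batch_sizes`, and the shapes used are assumptions made for illustration; they do not come from the question.

```python
import numpy as np

def process_batch(batch):
    # Hypothetical stand-in for the real per-batch computation:
    # maps (batch_size, 4) inputs to (batch_size, 2, 3) outputs.
    return np.repeat(batch[:, :2, None], 3, axis=2)

def run_preallocated(inputs, batch_sizes, out_shape):
    # Allocate the full result once: total item count first, then the per-item shape.
    result = np.empty((inputs.shape[0], *out_shape), dtype=inputs.dtype)
    start = 0
    for size in batch_sizes:
        stop = start + size
        # Each batch fills its own slice along the batch axis.
        result[start:stop] = process_batch(inputs[start:stop])
        start = stop
    return result

inputs = np.arange(40, dtype=float).reshape(10, 4)
result = run_preallocated(inputs, batch_sizes=[3, 5, 2], out_shape=(2, 3))
print(result.shape)  # (10, 2, 3)
```

This relies on the total number of items and the per-item shape being known before the loop starts. If they are not, accumulating the batch results and concatenating them is the simpler route.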

The answer by @Scott is correct. I was however looking for the incremental version which I think I've found:

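Define results = np.empty((0, *output_shape)), where output_shape is the per-item shape (X, Y), and then update it in the loop using results = np.concatenate((results, some_func(x))), which joins along the first (batch) axis by default.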

I'm not sure how I should think about a dimension of size 0 in NumPy, but it works.
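
One way to think about the size-0 dimension: an array of shape (0, X, Y) holds no elements but still carries the per-item shape (X, Y), so np.concatenate along axis 0 accepts it as a valid, empty starting point. The sketch below shows this next to a common alternative, which is to collect the batch results in a list and concatenate once at the end. Here `some_func`, `batches`, and the shapes are hypothetical placeholders.

```python
import numpy as np

def some_func(x):
    # Hypothetical per-batch function: (batch_size, 4) -> (batch_size, 2, 3)
    return np.ones((x.shape[0], 2, 3)) * x[:, :1, None]

batches = [np.random.rand(n, 4) for n in (3, 5, 2)]
output_shape = (2, 3)  # per-item shape (X, Y)

# Incremental version: start from an array with zero items along the batch axis.
results = np.empty((0, *output_shape))
for x in batches:
    results = np.concatenate((results, some_func(x)))  # axis=0 by default

# Alternative: gather the per-batch results, then concatenate a single time.
collected = [some_func(x) for x in batches]
results_once = np.concatenate(collected, axis=0)

print(results.shape, results_once.shape)
```

Both variants produce the same array. The incremental one copies the accumulated data on every np.concatenate call, while the list-based one copies it only once, which matters if the number of batches grows large.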


 