
How can I quickly put many 2D numpy arrays into a 4D numpy array?

I have about 150,000 images which I want to load into a numpy array of shape [index][y][x][channel]. Currently, I do it like this:

import numpy
import scipy.ndimage

# data is the collection of image file names
images = numpy.zeros((len(data), 32, 32, 1))
for i, fname in enumerate(data):
    img = scipy.ndimage.imread(fname, flatten=False, mode='L')
    img = img.reshape((1, img.shape[0], img.shape[1], 1))
    # copy the image into the big array pixel by pixel
    for y in range(32):
        for x in range(32):
            images[i][y][x][0] = img[0][y][x][0]

This works, but I think there must be a better solution than iterating over the elements. I could get rid of the reshaping, but this would still leave the two nested for-loops.

What is the fastest way to achieve the same 4D images array, given 150,000 images which need to be loaded into it?

Generally, you don't need to copy single elements when dealing with numpy arrays. You can just specify the axes you want to copy to and/or from, as long as the shapes on both sides are equal or broadcastable:

images[i,:,:,0] = img[0,:,:,0]

instead of your loops. In fact you don't need the reshape at all:

images[i,:,:,0] = scipy.ndimage.imread(fname, flatten=False, mode='L')
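As a quick check that the slice assignment fills the array exactly like the nested loops do, here is a minimal sketch. It uses randomly generated 32x32 arrays in place of files read from disk; fake_images and the count of 5 are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the loaded grayscale images (hypothetical data, not read from disk).
fake_images = [rng.integers(0, 256, size=(32, 32), dtype=np.uint8) for _ in range(5)]

# Element-by-element copy, as in the question.
looped = np.zeros((len(fake_images), 32, 32, 1))
for i, img in enumerate(fake_images):
    for y in range(32):
        for x in range(32):
            looped[i][y][x][0] = img[y][x]

# One slice assignment per image, as suggested above.
sliced = np.zeros((len(fake_images), 32, 32, 1))
for i, img in enumerate(fake_images):
    sliced[i, :, :, 0] = img

# Both arrays should hold identical values.
print(np.array_equal(looped, sliced))
```

The loop over images remains, but the 1,024 Python-level element assignments per image collapse into a single copy done inside numpy.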

Each : specifies that you want that axis to be kept in full (not reduced to a single index), and numpy supports array-to-array assignment, for example:

>>> a = np.zeros((3,3,3))
>>> a[0, :, :] = np.ones((3, 3))
>>> a
array([[[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]])

or

>>> a = np.zeros((3,3,3))
>>> a[:, 1, :] = np.ones((3, 3))
>>> a
array([[[ 0.,  0.,  0.],
        [ 1.,  1.,  1.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 1.,  1.,  1.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 1.,  1.,  1.],
        [ 0.,  0.,  0.]]])

Essentially there are 2 approaches. The first preallocates the result array and fills it slot by slot:

res = np.zeros((<correct shape>), dtype)
for i in range(...):
   img = <load>
   <reshape if needed>
   res[i,...] = img

If you've chosen the initial shape of res correctly, you should be able to copy each image array into its slot without an element-wise loop or much reshaping.
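A runnable sketch of this preallocation approach might look like the following. The load_image function is a hypothetical stand-in that builds an array in memory instead of reading a file, and the 32x32 size and uint8 dtype are assumptions carried over from the question.

```python
import numpy as np

def load_image(index):
    """Hypothetical loader: returns a 32x32 grayscale array instead of reading a file."""
    return np.full((32, 32), index, dtype=np.uint8)

n_images = 4
# Allocate the full result once, with the final shape and dtype.
res = np.zeros((n_images, 32, 32, 1), dtype=np.uint8)
for i in range(n_images):
    img = load_image(i)
    # Add a trailing channel axis so the image matches the (32, 32, 1) slot.
    res[i, ...] = img[:, :, np.newaxis]

print(res.shape)
```

Passing the dtype up front keeps the result at one byte per pixel; the default float64 would take eight times as much memory, which matters at 150,000 images.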

The other approach appends each image to a list and builds the array at the end:

alist = []
for _ in range(...):
   img = <load>
   <reshape>
   alist.append(img)
res = np.array(alist)

This collects all component arrays into a list and uses np.array to join them into one array with a new dimension at the start. np.stack gives a little more control over which axis the arrays are joined along.
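To make the difference between np.array and np.stack concrete, here is a small sketch. The list of constant-valued 32x32 arrays is invented for illustration; in practice each entry would be a loaded image.

```python
import numpy as np

# Hypothetical stand-ins for three loaded 32x32 grayscale images.
alist = [np.full((32, 32), k, dtype=np.uint8) for k in range(3)]

# np.array puts the new dimension first: shape (3, 32, 32).
joined = np.array(alist)

# np.stack does the same by default, but lets you choose the axis.
stacked_first = np.stack(alist, axis=0)   # shape (3, 32, 32)
stacked_last = np.stack(alist, axis=-1)   # shape (32, 32, 3)

# Append a channel axis to get the [index][y][x][channel] layout.
images = joined[..., np.newaxis]          # shape (3, 32, 32, 1)

print(joined.shape, stacked_last.shape, images.shape)
```

Note that the list approach briefly holds both the list of images and the final array in memory, whereas preallocating holds only the final array plus one image at a time.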
