Is there a faster way to recover images from a large 2-D array using vectorized operations in numpy?

Question

I have a large 2-D array (typically 0.5 to 2 GB) of shape n x 1008. The array holds several images, and its values are the pixel values. The images are recovered as follows:

Start iterating over the array.
Take the first 260 rows, i.e. 260*1008 = 262080 values.
From the 261st row take only the first 64 values (the remaining values in that row are junk). That gives 262144 pixel values.
Collect all these values in a 1-D array, say dump, and call np.reshape(dump, (512,512)) to obtain the image. Note that 512x512 = 262144.
Repeat the same procedure starting from the 262nd row.
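
The layout described in these steps can be checked with a short sketch. The constant names and the helper image_rows are hypothetical, introduced only for illustration; the numbers come from the steps above.

```python
ROW_LEN = 1008     # values per row of the input array
FULL_ROWS = 260    # rows that are used completely for each image
PARTIAL = 64       # values used from the 261st row of each image
SIDE = 512         # each image is SIDE x SIDE pixels

# 260 full rows plus 64 values fill exactly one 512 x 512 image.
assert FULL_ROWS * ROW_LEN + PARTIAL == SIDE * SIDE

# Values discarded at the end of each image's last row.
JUNK_PER_IMAGE = ROW_LEN - PARTIAL


def image_rows(k):
    """Return (first_row, last_row) of image k, 0-based and inclusive."""
    first = k * (FULL_ROWS + 1)
    return first, first + FULL_ROWS
```

So each image occupies a block of 261 consecutive rows, and only the tail of the last row in each block is unused.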

This is my solution:

import numpy as np

counter = 0
dump = np.array([], dtype=np.uint16)
# pixelDat is the array shaped n x 1008 containing the pixel values
for j in range(len(pixelDat)):
    # Check if it is the last row for a particular image
    if j == 260 * (counter + 1) + counter:
        counter += 1
        dump = np.append(dump, pixelDat[j][:64])
        # Reshape dump to form the image and write it to a fits file
        hdu = fits.PrimaryHDU(np.reshape(dump, (512, 512)))
        hdu.writeto('img{0:0>4}.fits'.format(counter), clobber=True)
        # Clear dump to enable formation of next image
        dump = np.array([], dtype=np.uint16)
    else:
        dump = np.append(dump, pixelDat[j])

I have been wondering whether there is a way to speed up this whole process. The first thing that came to mind is vectorized numpy operations, but I am not sure how to apply them in this case.

PS: Do not worry about the fits and hdu part. It just creates a .fits file for my image.

Answer 1

Here is an attempt using flattening and np.split. It avoids copying the data.

import numpy as np

def chop_up(pixelDat):
    sh = pixelDat.shape
    try:
        # since the array is large we do not want a copy
        # the next line will succeed only if we can reshape in-place
        pixelDat.shape = -1
    except AttributeError:
        return False # user must resort to other method
    N = len(pixelDat)
    # split points: the end of each image, then the start of the next one
    split = (np.arange(0, N, 261*1008)[:, None] + (0, 512*512)).ravel()[1:]
    if split[-1] > N:
        # the last block is too short to hold a full image
        split = split[:-2]
    # np.split returns views; keep only the image-sized pieces
    result = [x.reshape(512,512) for x in np.split(pixelDat, split) if len(x) == 512*512]
    pixelDat.shape = sh
    return result
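
As a possible alternative to np.split (not part of the answer above), the same layout can also be expressed with plain reshapes and one slice. This is a minimal sketch, assuming the array is C-contiguous and that each image occupies exactly 261 rows as described in the question. The function name images_by_reshape and the constants are hypothetical.

```python
import numpy as np

ROWS_PER_IMAGE = 261   # 260 full rows plus 1 partially used row
SIDE = 512             # each image is SIDE x SIDE pixels


def images_by_reshape(pixel_dat):
    """Return an (n_images, 512, 512) array of the images in pixel_dat.

    Trailing rows that do not form a complete 261-row block are ignored.
    """
    n_images = pixel_dat.shape[0] // ROWS_PER_IMAGE
    # Put each 261-row block on one line: 261 * 1008 values per image.
    blocks = pixel_dat[:n_images * ROWS_PER_IMAGE].reshape(n_images, -1)
    # The first 512 * 512 values of each block are pixels, the rest is junk.
    return blocks[:, :SIDE * SIDE].reshape(n_images, SIDE, SIDE)
```

For a C-contiguous input these steps should only produce views, so no pixel data is copied. The result can then be indexed as images[k] when writing each file.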
