
Code taking too long to execute while doing batch processing of Image

I am trying to calculate the mean temperature for a certain range of zenith angle for every image.

I am using a for loop to perform the above-mentioned task. Inside the loop, I am calculating the distance of each pixel from the center and then applying conditions based on the distance.

The image has lots of unwanted reflections, which I am removing using the df_con data frame.

The loop takes 1 min 30 sec to run all the operations on one image (38 minutes for just 24 images). Is there a way to improve the speed of the code?

import glob
import os

import cv2 as cv
import numpy as np
import pandas as pd
from tqdm import tqdm

### Reading all the images inside the Images folder
files = glob.glob("Images/*.jpg")  # Total 17,000 images of (480*640*3)
files.sort(key=os.path.getmtime, reverse=True)

X_data = [cv.imread(img) for img in files]

image_data = np.array(X_data)

T_hot = np.array([])

for n in tqdm(range(image_data.shape[0])):

    ## Converting the BGR image to a greyscale image
    grey = cv.cvtColor(image_data[n], cv.COLOR_BGR2GRAY)


    Z = grey.reshape(-1, 1)

    Tem = np.array([])

    # Linear grey-level to temperature mapping: T = m*Z + c
    Tmax = 25
    Tmin = -10

    Zmax = 255
    Zmin = 0
    c = -10

    m = (Tmax - Tmin) / (Zmax - Zmin)

    zenith = np.array([])
    # 200 px of radius from the centre spans 0-90 degrees of zenith angle
    theta = np.around(np.arange(0, 90, 90/200), 6)

    for i in range(0, 480):
        for j in range(0, 640):

            # Calculating the distance of each pixel from the centre (332, 235).
            r = np.around(np.sqrt((332 - j)**2 + (235 - i)**2))

            # Assigning a zenith angle to each pixel and
            # calculating the temperature of the indexed pixel.
            if r < 200:
                k = theta[theta == np.around((r*90/200), 6)]
                zenith = np.append(zenith, k)
                T = (m*grey[i, j]) + c
                Tem = np.append(Tem, T)
            else:
                k = 120
                zenith = np.append(zenith, k)
                T = 255
                Tem = np.append(Tem, T)



    # Creating a pandas DataFrame
    df = pd.DataFrame({'Pxl': Z[:, 0], 'Tem': Tem[:], 'zenith': zenith[:]})

    # Fetching the image mask data points
    df_con = pd.read_excel('contour.xlsx')

    dataset_final = pd.merge(df, df_con, how='outer', on=None,
                             left_index=True, right_index=True, sort=True)
    dataset_final = dataset_final[dataset_final['pxl_new'] < 255]

    df_0 = pd.DataFrame(0, index=range(Z.shape[0]), columns=['Null'])

    df_image = pd.merge(dataset_final,df_0, how='outer', on=None, \
                            left_index=True, right_index=True,\
                         sort=True)

    df_image = df_image[['Pxl','Tem','zenith']].fillna(255)


    df_target = dataset_final[(dataset_final['zenith'] >= 65) & \
                              (dataset_final['zenith'] <= 85)]
    mean = np.mean(df_target[['Tem']].values)
    T_hot = np.append(T_hot, mean)

So, after 3 long days of struggle, I managed to partially solve the problem. I noticed three things in my code:

  1. I was reading all the images in color and later converting them to grayscale, which was an unnecessary extra step and cost a lot of computation time. Thanks to @Mark Setchell, who pointed that out for me.
X_data = []

files = glob.glob("Images/*.jpg")
files.sort(key=os.path.getmtime, reverse=True)

for img in tqdm(files):
    # Reading images directly in grayscale
    X = cv.imread(img, cv.IMREAD_GRAYSCALE)
    X_data.append(X)
  2. I was using a pandas DataFrame inside the loop to perform the indexing operations. Although pandas makes life easy, it takes a huge amount of computation time, so I decided to go with NumPy arrays instead of a pandas DataFrame (see the NumPy sketch after the multiprocessing example below).

  3. I implemented multiprocessing for the code block below.

for i in range(0, 480):
    for j in range(0, 640):

        r = np.around(np.sqrt((332 - j)**2 + (235 - i)**2))

        if r < 200:
            k = theta[theta == np.around((r*90/200), 6)]
            zenith = np.append(zenith, k)
            T = (m*grey[i, j]) + c
            Tem = np.append(Tem, T)
        else:
            k = 120
            zenith = np.append(zenith, k)
            T = 255
            Tem = np.append(Tem, T)

Now the new code block is something like this:

def temperature(count):
    # m, c and grey are module-level globals shared with the worker processes
    zenith = np.array([])
    Tem = np.array([])

    theta = np.around(np.arange(0, 90, 90 / 200), 6)
    for j in range(0, 640):
        r = np.around(np.sqrt((332 - j) ** 2 + (235 - count) ** 2))

        if r < 200:
            k = theta[theta == np.around((r * 90 / 200), 6)]
            zenith = np.append(zenith, k)
            T = (m * grey[count, j]) + c
            Tem = np.append(Tem, T)
        else:
            k = 120
            zenith = np.append(zenith, k)
            T = 20
            Tem = np.append(Tem, T)

    # One row per pixel of this image row: [zenith, temperature]
    result = np.vstack((zenith, Tem)).T
    return result

from multiprocessing import Pool, cpu_count

if __name__ == '__main__':
    pool = Pool(cpu_count())
    result = pool.map(temperature, range(0, 480))
    pool.close()
    res = np.array(result)                 # shape (480, 640, 2)
    Tem = res[:, :, 1].reshape(-1, 1)
    zenith = res[:, :, 0].reshape(-1, 1)
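
For point 2, this is roughly what the pandas merge/filter step looks like with plain NumPy boolean masks. A minimal sketch, assuming pxl_new is the mask column from contour.xlsx, that Tem, zenith and pxl_new are flat arrays of length 480*640, and that the one-off load outside the per-image loop and the helper name mean_hot_temperature are my own additions:

import numpy as np
import pandas as pd

# Load the reflection mask once, outside the per-image loop
pxl_new = pd.read_excel('contour.xlsx')['pxl_new'].to_numpy()

def mean_hot_temperature(Tem, zenith, pxl_new):
    """Mean of Tem over unmasked pixels in the 65-85 degree zenith band."""
    keep = (pxl_new < 255) & (zenith >= 65) & (zenith <= 85)
    return Tem[keep].mean()

Boolean indexing like this is a single C-level pass over the array, so the per-image merge and filter cost essentially disappears, and contour.xlsx is no longer re-read on every iteration.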

By implementing the above changes, I managed to reduce the processing time for a single image from 1 min 30 sec to 2 sec. I am sure there are better ways to optimize this further. Please feel free to offer your solution; it will be a big help for a fresher like me.
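
Since the distance of a pixel from the centre depends only on its (i, j) position and not on the image, the whole inner double loop can in principle be precomputed once outside the image loop. Below is a minimal vectorized sketch of that idea, assuming the same centre (332, 235), 200 px radius and linear grey-to-temperature mapping as in the question; mean_hot is a hypothetical helper name, and the reflection mask is left out for brevity:

import numpy as np

H, W = 480, 640
m, c = 35 / 255, -10.0   # grey-level -> temperature mapping from the question

# Precompute once: distance of every pixel from the centre (332, 235)
i, j = np.indices((H, W))
r = np.around(np.sqrt((332 - j) ** 2 + (235 - i) ** 2))

inside = r < 200                                  # pixels within the 200 px sky radius
zenith = np.where(inside, r * 90 / 200, 120.0)    # 120 marks outside pixels, as in the question
band = inside & (zenith >= 65) & (zenith <= 85)   # 65-85 degree zenith band

def mean_hot(grey):
    """Mean temperature of one greyscale image over the zenith band."""
    Tem = m * grey.astype(np.float64) + c
    return Tem[band].mean()

The contour mask from point 2 can be folded into band with one more &, after which the per-image cost is a single multiply-add plus a masked mean, with no Python-level loops at all.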


 