Image segmentation using a maximum likelihood algorithm in Python

I would like to perform image segmentation using a maximum likelihood algorithm implemented in Python. The mean vectors and covariance matrices of the classes are known, and by iterating over the image (which is quite big... 5100x7020) we can calculate, for each pixel, the probability that it belongs to a given class.
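For reference, the per-pixel quantity being computed is the multivariate Gaussian density

p(X) = exp(-0.5 * (X - mean).T @ inv(cov) @ (X - mean)) / ((2*pi)**(d/2) * sqrt(det(cov)))

where d is the length of the pixel vector (d = 3 for a 3-band image); the denominator is the norm constant in the code below.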

Written simply in Python:

import numpy as np
from numpy.linalg import inv
from numpy.linalg import det
...

probImage1 = []
probImage1Vector = []

d = meanVectorClass1.size  # length of the pixel vector (3 for a 3-band image)
norm = 1.0 / (np.power(2 * np.pi, d / 2.0) * np.sqrt(np.linalg.det(covMatrixClass1)))
covMatrixInverz = np.linalg.inv(covMatrixClass1)
x_img, y_img = realImage.shape[:2]  # image dimensions
for x in range(x_img):
    for y in range(y_img):
        X = realImage[x, y]
        pixelValueDifference = X - meanVectorClass1
        mult1 = np.multiply(-0.5, np.transpose(pixelValueDifference))
        mult2 = np.dot(covMatrixInverz, pixelValueDifference)
        multMult = np.dot(mult1, mult2)
        expo = np.exp(multMult)
        probImage1Vector.append(np.multiply(norm, expo))
    probImage1.append(probImage1Vector)
    probImage1Vector = []

The problem is that this code is very slow on large images. Calculations like the vector subtraction and multiplication consume a lot of time, even though they operate on only 1x3 vectors.

Could you please give a hint on how to speed up this code? I would really appreciate it. Sorry if I was not clear; I am still a beginner in Python.

Taking a closer look at:

mult1 = np.multiply(-0.5,np.transpose(pixelValueDifference))
mult2 = np.dot(covMatrixInverz,pixelValueDifference)
multMult = np.dot(mult1,mult2)

We see that the operation is basically :

A.T (d) C (d) A         # where `(d)` is the dot-product

Those three steps can be expressed as a single np.einsum call, like so -

np.einsum('k,lk,l->',pA,covMatrixInverz,-0.5*pA)
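
As a quick sanity check on a single 1x3 vector (hypothetical values, purely illustrative), the one-call einsum matches the three-step version -

import numpy as np

pA = np.array([1.0, -2.0, 0.5])        # hypothetical pixel difference
C = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.8]])        # hypothetical 3x3 inverse covariance
step_by_step = np.dot(np.multiply(-0.5, np.transpose(pA)), np.dot(C, pA))
one_einsum = np.einsum('k,lk,l->', pA, C, -0.5*pA)
print(np.allclose(step_by_step, one_einsum))   # True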

Performing this across both iterators i (= x) and j (= y), with pA = realImage - meanVectorClass1 as the (x_img, y_img, 3) array of pixel differences, we get a fully vectorized expression -

np.einsum('ijk,lk,ijl->ij', pA, covMatrixInverz, -0.5*pA)

Alternatively, we could perform the first part of the sum-reduction with np.tensordot, which reshapes its inputs and dispatches to a BLAS matrix multiply -

mult2_vectorized = np.tensordot(pA, covMatrixInverz, axes=([2],[1]))
output = np.einsum('ijk,ijk->ij',-0.5*pA, mult2_vectorized)

Benchmarking

Listing all approaches as functions -

# Original code posted by OP, modified to return an array
def org_app(meanVectorClass1, realImage, covMatrixInverz, norm):
    probImage1 = []
    probImage1Vector = []
    x_img, y_img = realImage.shape[:2]
    for x in range(x_img):
        for y in range(y_img):
            X = realImage[x,y]
            pixelValueDifference = X - meanVectorClass1
            mult1 = np.multiply(-0.5,np.transpose(pixelValueDifference))
            mult2 = np.dot(covMatrixInverz,pixelValueDifference)
            multMult = np.dot(mult1,mult2)
            expo = np.exp(multMult)
            probImage1Vector.append(np.multiply(norm,expo))
            probImage1.append(probImage1Vector)
            probImage1Vector = []
    return np.asarray(probImage1).reshape(x_img,y_img)

def vectorized(meanVectorClass1, realImage, covMatrixInverz, norm):
    pA = realImage - meanVectorClass1
    mult2_vectorized = np.tensordot(pA, covMatrixInverz, axes=([2],[1]))
    return np.exp(np.einsum('ijk,ijk->ij',-0.5*pA, mult2_vectorized))*norm

def vectorized2(meanVectorClass1, realImage, covMatrixInverz, norm):
    pA = realImage - meanVectorClass1
    return np.exp(np.einsum('ijk,lk,ijl->ij',pA,covMatrixInverz,-0.5*pA))*norm

Timings -

In [19]: # Setup inputs
    ...: meanVectorClass1 = np.array([23.96000000, 58.159999, 61.5399])
    ...: 
    ...: covMatrixClass1 = np.array([[ 514.20040404,  461.68323232,  364.35515152],
    ...:        [ 461.68323232,  519.63070707,  446.48848485],
    ...:        [ 364.35515152,  446.48848485,  476.37212121]])
    ...: covMatrixInverz = np.linalg.inv(covMatrixClass1)
    ...: 
    ...: norm = 0.234 # Random float number
    ...: realImage = np.random.rand(1000,2000,3)
    ...: 

In [20]: out1 = org_app(meanVectorClass1, realImage, covMatrixInverz, norm )
    ...: out2 = vectorized(meanVectorClass1, realImage, covMatrixInverz, norm )
    ...: out3 = vectorized2(meanVectorClass1, realImage, covMatrixInverz, norm )
    ...: print(np.allclose(out1, out2))
    ...: print(np.allclose(out1, out3))
    ...: 
True
True

In [21]: %timeit org_app(meanVectorClass1, realImage, covMatrixInverz, norm )
1 loops, best of 3: 27.8 s per loop

In [22]: %timeit vectorized(meanVectorClass1, realImage, covMatrixInverz, norm )
1 loops, best of 3: 182 ms per loop

In [23]: %timeit vectorized2(meanVectorClass1, realImage, covMatrixInverz, norm )
1 loops, best of 3: 275 ms per loop

Looks like the fully vectorized tensordot + einsum hybrid solution is doing pretty well - tensordot's BLAS-backed matrix multiply gives it the edge over the pure einsum version.

For a further performance boost, one can also look into the numexpr module to speed up the exponential computation on such large arrays.
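
A minimal sketch of that idea, assuming numexpr is installed - the quadratic form is the same tensordot + einsum combination as above, and only the final elementwise step changes:

import numpy as np
import numexpr as ne

def vectorized_ne(meanVectorClass1, realImage, covMatrixInverz, norm):
    pA = realImage - meanVectorClass1
    mult2_vectorized = np.tensordot(pA, covMatrixInverz, axes=([2],[1]))
    q = np.einsum('ijk,ijk->ij', -0.5*pA, mult2_vectorized)
    # numexpr compiles the elementwise expression and evaluates it
    # multi-threaded, without allocating extra temporaries
    return ne.evaluate('exp(q) * norm')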

As a first step, I would get rid of unnecessary function calls like transpose, dot, and multiply. These are all simple calculations that you can write inline; when you can actually see what you are doing, instead of hiding it inside function calls, it is easier to understand the performance problem.
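
For a single pixel, that inlining might look like the following (d and C are hypothetical names for the 1x3 difference vector and the 3x3 inverse covariance):

d = [1.0, -2.0, 0.5]          # hypothetical pixel difference
C = [[2.0, 0.3, 0.1],
     [0.3, 1.5, 0.2],
     [0.1, 0.2, 1.8]]         # hypothetical inverse covariance
# quadratic form written out with plain arithmetic, no function calls
q = -0.5 * (d[0]*(C[0][0]*d[0] + C[0][1]*d[1] + C[0][2]*d[2])
          + d[1]*(C[1][0]*d[0] + C[1][1]*d[1] + C[1][2]*d[2])
          + d[2]*(C[2][0]*d[0] + C[2][1]*d[1] + C[2][2]*d[2]))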

The fundamental issue here is the sheer number of operations: two nested loops over a 5100x7020 image, with several Python-level calls per pixel. You might want to simply multiply out how many operations you are doing in all of your loops. Is it 500 million, 2 billion, 350 billion? How many?

To get control of performance you need to understand how many instructions you are executing. A modern CPU core executes on the order of a billion native instructions per second, but each interpreted Python operation costs many of those, and memory movement can make things substantially slower still.
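
A back-of-the-envelope count for the image size in the question (the per-pixel call count is an assumption read off the posted loop body):

pixels = 5100 * 7020              # ~35.8 million pixels
calls_per_pixel = 7               # subtract, transpose, two dots, multiply, exp, append
print(pixels * calls_per_pixel)   # 250614000 -- about a quarter of a billion interpreted calls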
