
Implementing Weighted Mean Absolute/Square difference in OpenCV

I am trying to implement some sort of correlation tracking between a template image and the frames of a video stream in Python-OpenCV.

I am trying to use weighted mean absolute deviation (weighted MAD) as the similarity measure between the template and the video frames (the object should be at the location of minimum MAD).

The equation I need is:

WMAD(x, y) = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} w(i,j) \, \left| F(x+i, y+j) - T(i,j) \right|

where F is the image, T is the m x n template, and w is the weight window (same size as the template).

I am aware that OpenCV provides a function that does template matching (i.e. cv2.matchTemplate). The closest mode to MAD is TM_SQDIFF_NORMED, which is based on the mean square deviation (MSD). I believe OpenCV implements this equation:

MSD(x, y) = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left( F(x+i, y+j) - T(i,j) \right)^2

which would give the measure of similarity I want if there were a way to put a weight function inside it, like this:

WMSD(x, y) = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} w(i,j) \left( F(x+i, y+j) - T(i,j) \right)^2

My question is: how can I implement either weighted MAD or weighted MSD in OpenCV without writing the loops myself (so as not to lose speed), utilizing the cv2.matchTemplate function or a similar approach?

You can do it with small matrix tricks. I will try to explain with an example.
If you have a 3x3 template t with values t_ij and a 3x3 weight kernel w with values w_ij, you can create 9 shifted views of your original image, one per kernel position, by moving it once in each direction. You will end up with a stack of 9 images.
Now flatten the template t and subtract it from the stacked 9 images. The result is equivalent to sliding the template over the image.
After taking the absolute value, do the same (flatten and multiply) with w.
Finally, sum the tensor along the new axis and you end up with the solution.

Example of an implementation:

import numpy as np

def stack_image(image, n):
    # build an (H-n+1, W-n+1, n*n) stack of shifted views, one channel
    # per kernel position (row-major order, matching t.flatten())
    channels = []
    row, col = image.shape
    for i in range(n):
        for j in range(n):
            channels.append(image[i:row - (n - i) + 1, j:col - (n - j) + 1])
    return np.stack(channels, axis=-1)

def weighted_mad(f, t, w):
    image_stack = stack_image(image=f, n=t.shape[0])
    image_stack = np.abs(image_stack - t.flatten()) * w.flatten()
    image_stack = image_stack.sum(axis=-1)

    # normalise by the number of template pixels (the m*n in the equation)
    return image_stack / t.size

Notes:

  • my implementation does not process the borders ("valid" mode); one can implement it in other ways.
  • my implementation assumes a square kernel (k x k), but one can implement it with a rectangular one.
  • the solution will be efficient only if the kernel size is not too large.
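To sanity-check the stacking trick, one can compare it against a naive double loop on random data (the helpers are repeated here, slightly condensed, so the snippet runs standalone):

```python
import numpy as np

def stack_image(image, n):
    # same helper as above, repeated for a self-contained check
    channels = []
    row, col = image.shape
    for i in range(n):
        for j in range(n):
            channels.append(image[i:row - (n - i) + 1, j:col - (n - j) + 1])
    return np.stack(channels, axis=-1)

def weighted_mad(f, t, w):
    stack = stack_image(f, t.shape[0])
    return (np.abs(stack - t.flatten()) * w.flatten()).sum(axis=-1) / t.size

rng = np.random.default_rng(42)
f = rng.random((20, 20))
t = rng.random((5, 5))
w = rng.random((5, 5))

fast = weighted_mad(f, t, w)

# naive reference: slide the template over every valid position
slow = np.empty_like(fast)
for x in range(slow.shape[0]):
    for y in range(slow.shape[1]):
        patch = f[x:x + 5, y:y + 5]
        slow[x, y] = np.sum(w * np.abs(patch - t)) / t.size
```

Both maps agree to floating-point precision, which confirms that the channel order of the stack matches `t.flatten()`.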

I will answer my own question, inspired by this answer: Compute mean squared, absolute deviation and custom similarity measure - Python/NumPy

I decided to go with weighted MSD (mean square deviation), since I can expand the square bracket and distribute the weight over the three terms. Here are the steps:

1- Expanding the square bracket:

(F - T)^2 = F^2 - 2 F T + T^2

2- Distributing the window kernel W over the expanded bracket:

\sum_{i,j} w \left( F - T \right)^2 = \sum_{i,j} w F^2 - 2 \sum_{i,j} w T F + \sum_{i,j} w T^2

3- Distributing the two sum operators over each term, we end up with three terms:

3.a- Convolution between the image square (F^2) and the window (W)

3.b- -2 * convolution between the image (F) and the element-wise product of window and template (W * T)

3.c- Summation of the element-wise product of template square and window (T^2 * W), which is a constant

and multiply by 1/(m*n) at the end.

Here is how to do it in Python with OpenCV:

import cv2
import numpy as np

def wmsd(img, tmp, W):
    # input: img = image
    # input: tmp = template
    # input: W = weighting window (same size as the template)
    # return: msd_map = weighted mean square deviation map

    th, tw = tmp.shape

    # work in float64: squaring uint8 would overflow, filter2D does not
    # support uint64, and floats keep the negative term from clipping
    img = np.float64(img)
    img_sq = np.square(img)
    tmp_sq = np.square(np.float64(tmp))

    # 3.a: convolution of F^2 with W (flipping the kernel turns
    # filter2D's correlation into a true convolution)
    p1 = cv2.filter2D(img_sq, -1, cv2.flip(W, -1))

    # 3.b: -2 * convolution of F with the element-wise product W*T
    WT = W * tmp
    p2 = -2 * cv2.filter2D(img, -1, cv2.flip(WT, -1))

    # 3.c: constant term, sum of T^2 * W
    p3 = np.sum(tmp_sq * W)

    msd_map = (p1 + p2 + p3) / (th * tw)
    return msd_map

Seen this way, it is easy to utilize OpenCV's power to perform this operation quickly, with a good frame rate.
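The expansion in steps 1-3 can be verified numerically with plain NumPy on a small random example (a sketch for checking the algebra only; the cv2.filter2D version above is what gives the speed):

```python
import numpy as np

# small random example to check the three-term expansion of weighted MSD
rng = np.random.default_rng(0)
F = rng.random((12, 12))
T = rng.random((4, 4))
W = rng.random((4, 4))
th, tw = T.shape

out_h, out_w = F.shape[0] - th + 1, F.shape[1] - tw + 1
direct = np.empty((out_h, out_w))
expanded = np.empty((out_h, out_w))
for x in range(out_h):
    for y in range(out_w):
        patch = F[x:x + th, y:y + tw]
        # direct weighted MSD at this position
        direct[x, y] = np.sum(W * (patch - T) ** 2) / (th * tw)
        # expansion: sum(W*F^2) - 2*sum(W*T*F) + sum(W*T^2)
        expanded[x, y] = (np.sum(W * patch ** 2)
                          - 2 * np.sum(W * T * patch)
                          + np.sum(W * T ** 2)) / (th * tw)
```

The two maps agree to floating-point precision, so the three filter2D/summation terms together reproduce the weighted MSD exactly.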
