Base64 Image Comparison

Let's say I have two images, A and B, with the same size, the same number of channels and the same format (for example, both are 25x25 RGB images in PNG).

I want to compare these two images and produce a score for how different they are, based on the sum of the per-pixel differences. However, the images are encoded in Base64 (like images embedded in HTML pages).

My question is: does summing the differences between corresponding characters of the two Base64 strings necessarily give an estimate of how different or similar A and B are?

I wrote a test in Python that generates three random images, img1, img2 and img3, compares them pixel by pixel, and then compares their Base64 encodings. For each method I check whether img1 is closer to img2 or to img3, and then test whether the two methods agree.

The answer is no. In my test the two methods disagreed in 488 out of 1000 trials, which is roughly what you would expect from chance, so the Base64 score is not a reliable estimate of the pixel score.
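A small illustration of why the two scores can diverge, written as a sketch rather than as part of the original test. The helper names `byte_distance` and `base64_char_distance` are made up for this example. Base64 regroups every 3 bytes into four 6-bit values and maps them onto an alphabet (`A-Z`, `a-z`, `0-9`, `+`, `/`) whose ASCII codes are not in increasing order, so a tiny change in one byte can move several characters, and a large change can move them very little.

```python
import base64

def byte_distance(a: bytes, b: bytes) -> int:
    """Sum of absolute differences between corresponding raw bytes."""
    return sum(abs(x - y) for x, y in zip(a, b))

def base64_char_distance(a: bytes, b: bytes) -> int:
    """Sum of absolute differences between the ASCII codes of the Base64 characters."""
    ea, eb = base64.b64encode(a), base64.b64encode(b)
    return sum(abs(x - y) for x, y in zip(ea, eb))

# Two single RGB pixels that differ by 1 in the first channel only.
p = bytes([103, 0, 0])   # encodes to b"ZwAA"
q = bytes([104, 0, 0])   # encodes to b"aAAA"
small_change = (byte_distance(p, q), base64_char_distance(p, q))

# Relative to a black pixel, 208 is further away than 100 in pixel space,
# but the Base64 score ranks them the other way round.
black = bytes([0, 0, 0])     # b"AAAA"
mid = bytes([100, 0, 0])     # b"ZAAA"
bright = bytes([208, 0, 0])  # b"0AAA"
pixel_order = (byte_distance(black, mid), byte_distance(black, bright))
base64_order = (base64_char_distance(black, mid), base64_char_distance(black, bright))

print(small_change)   # (1, 61)
print(pixel_order)    # (100, 208)
print(base64_order)   # (25, 17)
```

The script used for the 1000 trials follows.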

import numpy as np
import base64

# Generates a 10x10 RGB image with random uint8 values in [0, 255]
def generate_image():
    return np.random.randint(0, 256, size=(10, 10, 3), dtype=np.uint8)

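# First comparison method: sum of absolute per-pixel differences.
# Cast to a signed type first; subtracting uint8 arrays wraps around modulo 256.
def compare_numpy(img1, img2):
    return np.sum(np.abs(img1.astype(np.int32) - img2.astype(np.int32)))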

# Second comparison method: sum of absolute differences between the
# character codes of the two Base64 strings (of the raw pixel buffers)
def compare_base64(img1, img2):
    b1 = base64.b64encode(img1.tobytes())
    b2 = base64.b64encode(img2.tobytes())
    return sum(abs(c1 - c2) for c1, c2 in zip(b1, b2))

# Check whether both methods agree on which of img2 and img3 is closer to img1
def correlation_test():
    img1 = generate_image()
    img2 = generate_image()
    img3 = generate_image()

    # img1 is closer to img2 or img3

    # Testing pixel to pixel comparison
    cmp12 = compare_numpy(img1, img2)
    cmp13 = compare_numpy(img1, img3)
    if cmp12 < cmp13:
        result1 = 2
    else:
        result1 = 3

    # Testing Base64 comparison
    cmp12 = compare_base64(img1, img2)
    cmp13 = compare_base64(img1, img3)
    if cmp12 < cmp13:
        result2 = 2
    else:
        result2 = 3

    return result1 == result2

true_cnt = 0
false_cnt = 0

# Running the test 1000 times
for i in range(1000):
    if correlation_test():
        true_cnt += 1
    else:
        false_cnt += 1

print(f"The two methods agree {true_cnt} times")
print(f"The two methods disagree {false_cnt} times")

# Output from one run (counts vary from run to run): 512 agreements, 488 disagreements
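If a meaningful score is needed, one option is to decode the Base64 strings back to pixel data and compare there. The following is a minimal sketch under the same assumption as the test above, namely that each string encodes a raw uint8 pixel buffer of known shape. The function name `pixel_distance_from_base64` and its `shape` parameter are hypothetical. For actual PNG files, the decoded bytes are compressed image data and would additionally have to go through an image decoder before comparison.

```python
import base64
import numpy as np

def pixel_distance_from_base64(b64_a, b64_b, shape):
    """Decode two Base64-encoded raw uint8 buffers and sum the per-pixel differences."""
    a = np.frombuffer(base64.b64decode(b64_a), dtype=np.uint8).reshape(shape)
    b = np.frombuffer(base64.b64decode(b64_b), dtype=np.uint8).reshape(shape)
    # Widen to a signed type so the subtraction cannot wrap around.
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

# Example with two random 10x10 RGB images (seeded for repeatability).
rng = np.random.default_rng(0)
shape = (10, 10, 3)
img_a = rng.integers(0, 256, size=shape, dtype=np.uint8)
img_b = rng.integers(0, 256, size=shape, dtype=np.uint8)

enc_a = base64.b64encode(img_a.tobytes())
enc_b = base64.b64encode(img_b.tobytes())

# Reference score computed directly on the arrays.
direct = int(np.abs(img_a.astype(np.int64) - img_b.astype(np.int64)).sum())
decoded = pixel_distance_from_base64(enc_a, enc_b, shape)
print(decoded == direct)
```

Because Base64 is lossless, the decoded score matches the score computed on the original arrays.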
