Consider the code below. It converts an image to line art and then computes the md5sum of the bits. I don't know a better way to do this than with a generator expression producing individual bits. But then how can I feed the result to md5 in an efficient way?
The code below does it with a bitarray object, but I get non-deterministic results when handing bitarray instances (which seem to use fancy C stuff under the hood) to md5. So what is the "right" way to do this?
import os, hashlib
from PIL import Image
from bitarray import bitarray

def image_pixel_hash_code(image):
    pixels = list(image.getdata())
    avg = sum(pixels) / len(pixels)
    bits = bitarray(pixel < avg for pixel in pixels)
    return hashlib.md5(bits).hexdigest()

im = Image.open(os.path.expanduser("~/Downloads/test.jpg")).convert("L")
print image_pixel_hash_code(im)
PS I can reproduce the bitarray non-determinism, but I assume it's just a result of using two things together that aren't supposed to work together.
The hash is including random bits at the end of bits if the length of bits is not a multiple of 8. You can see this by looking at memoryview(bits). You could fix this by padding bits with 0s:
    bits = bitarray(1 if pixel < avg else 0 for pixel in pixels)
    bits.fill()
    return hashlib.md5(bits).hexdigest()
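For comparison, the same zero-padding idea can be sketched with only the standard library, which makes the padding explicit instead of relying on how bitarray lays out its buffer. This is a minimal sketch for Python 3, not the bitarray approach itself. The names pack_bits and pixel_hash are hypothetical, and the most-significant-bit-first order is an assumption, so its digests need not match the ones produced by the bitarray version.

```python
import hashlib

def pack_bits(bits):
    """Pack an iterable of truthy/falsy values into bytes, most
    significant bit first, zero-padding the final partial byte."""
    out = bytearray()
    byte = 0
    count = 0
    for bit in bits:
        byte = (byte << 1) | (1 if bit else 0)
        count += 1
        if count == 8:
            out.append(byte)
            byte = 0
            count = 0
    if count:
        # Shift the leftover bits up so the unused low bits are 0.
        out.append(byte << (8 - count))
    return bytes(out)

def pixel_hash(pixels):
    """Hash the 'darker than average' bitmap of a list of pixel values."""
    avg = sum(pixels) / len(pixels)
    return hashlib.md5(pack_bits(p < avg for p in pixels)).hexdigest()
```

Because the trailing bits are always written as 0, pixel_hash returns the same digest for the same pixel values on every run. With PIL it could be called as pixel_hash(list(im.getdata())).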