Python, get base64-encoded MD5 hash of an image object

I need to get a base64-encoded MD5 hash of an object, where the object is an image stored as a file, fname.

I've tried this:

def get_md5(fname):
    hash = hashlib.md5()
    with open(fname) as f:
        for chunk in iter(lambda: f.read(4096), ""):
            hash.update(chunk)
    return hash.hexdigest().encode('base64').strip()

However, I don't think this is right because it returns a string with too many characters. My understanding is that it needs to be 24 characters long. Instead I get:

NjJiM2RlOWMzOTYxYmM3MDI5Y2Q1NzdjOTQ5YWRlYTQ=

I've tried a few other similar approaches as well, for example one that does not use the chunk loop. They all return the same string.

(My later actions that need the base64-encoded MD5 hash fail, and I'm thinking this could be why.)

I was able to make it work by using digest() instead of hexdigest(). The last line then becomes:

return hash.digest().encode('base64').strip()

The result was then 24 characters long, and it was accepted by Google Cloud Storage transfer, which required a base64-encoded MD5 hash.

First, base64 encoding makes strings longer. (Example using IPython with Python 3):
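Note that `.encode('base64')` on a string only exists in Python 2. As a minimal sketch of the same fix for Python 3, assuming the file is read in binary mode (the name `get_md5_base64` and the `chunk_size` parameter are illustrative, not from the original post):

```python
import base64
import hashlib


def get_md5_base64(fname, chunk_size=4096):
    """Return the base64-encoded MD5 digest of a file as a str."""
    md5 = hashlib.md5()
    # Binary mode, so the hash sees the raw bytes of the image.
    with open(fname, "rb") as f:
        # iter() with a b"" sentinel stops once read() reaches end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    # digest() is the raw 16-byte hash; base64 of 16 bytes is 24 characters.
    return base64.b64encode(md5.digest()).decode("ascii")
```

Calling `get_md5_base64(fname)` should give a 24-character string ending in `==`, since 16 bytes leave one byte over after five full 3-byte groups.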

In [1]: s = '123456789012345678901234'

In [2]: len(s)
Out[2]: 24

In [3]: import base64

In [4]: e = base64.b64encode(s.encode('utf8'))

In [5]: len(e)
Out[5]: 32

In [6]: e
Out[6]: b'MTIzNDU2Nzg5MDEyMzQ1Njc4OTAxMjM0'

With base64 encoding you get 8 bits of output for every 6 bits of input.

In [7]: 32/24
Out[7]: 1.3333333333333333

In [8]: 8/6
Out[8]: 1.3333333333333333
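The 8-for-6 ratio means every 3 input bytes become 4 output characters, with the last group padded up to 4. A small sketch that checks this against the standard library (`b64_length` is a hypothetical helper name):

```python
import base64
import math


def b64_length(n_bytes):
    """Length of padded base64 output for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)


# 16 = raw MD5 digest, 24 = the example string above, 32 = MD5 hex digest
for n in (16, 24, 32):
    actual = len(base64.b64encode(bytes(n)))
    print(n, "->", b64_length(n), actual)
```

This prints 24, 32 and 44 for the three inputs, which matches the 24-character result expected for a raw MD5 digest and the 44 characters produced when the hex digest is encoded instead.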

The base64 alphabet uses 64 (or 2**6) different symbols. Generally these include the lowercase and uppercase letters and the digits 0-9. That leaves two more required symbols plus a padding character. Often + and / are used as the two symbols, but there are variations, especially since / is not allowed in UNIX or MS-Windows filenames.

Second, using a hexadecimal representation doubles the length of a byte string; the hex representation of one byte can vary between 00 and FF. Example (again using IPython and Python 3):
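As an illustration of one such variation, Python's `base64` module offers a URL- and filename-safe alphabet that swaps `+` and `/` for `-` and `_`. The two input bytes here are chosen only because they map onto exactly those symbols:

```python
import base64

raw = b"\xfb\xff"  # bit pattern that hits symbols 62 and 63 of the alphabet

standard = base64.b64encode(raw)          # standard alphabet: + and /
url_safe = base64.urlsafe_b64encode(raw)  # URL-safe alphabet: - and _

print(standard)  # b'+/8='
print(url_safe)  # b'-_8='
```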

In [1]: import hashlib

In [2]: s = b'this is a simple test'

In [3]: len(hashlib.md5(s).digest())
Out[3]: 16

In [4]: len(hashlib.md5(s).hexdigest())
Out[4]: 32

If you are going to use base64 encoding anyway, it makes no sense to use hexdigest().

I was generating a base64-encoded hash of inline JavaScript for the browser's CSP hash, and the accepted answer above gave me the following error. The reason is that not all string types are handled properly: in Python 3, digest() returns bytes, which have no encode method.

AttributeError: 'bytes' object has no attribute 'encode'

Unicode objects must be encoded before hashing, so I encode the string with inline.encode('utf-8') in the code below.

To solve the issue, use the following approach. It works like a charm.
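To see what went wrong in the question, the 44-character string can be decoded again. This sketch uses only the value quoted in the question and shows that it is the base64 encoding of the 32-character hex text rather than of the 16 raw bytes:

```python
import base64

# The string the question's get_md5() returned.
observed = "NjJiM2RlOWMzOTYxYmM3MDI5Y2Q1NzdjOTQ5YWRlYTQ="

# Decoding gives back the hexdigest() text, not the raw hash bytes.
hex_digest = base64.b64decode(observed).decode("ascii")
print(hex_digest)  # 62b3de9c3961bc7029cd577c949adea4

# Converting the hex text to the 16 raw bytes first gives the short form.
raw_digest = bytes.fromhex(hex_digest)
fixed = base64.b64encode(raw_digest).decode("ascii")
print(len(observed), len(fixed))  # 44 24
```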

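import hashlib
import base64

# Placeholder inline script, for illustration only; use your own script text.
inline = "console.log('hello');"

# Hash the UTF-8 bytes, then base64-encode the raw digest (not the hex digest).
base64hash = base64.b64encode(hashlib.sha256(inline.encode('utf-8')).digest())
sha = "sha256-" + base64hash.decode("utf-8")
print(sha)

==> This generates a base64-encoded SHA-256 hash of a string.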
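The same idea can be wrapped in a small function that accepts either str or bytes, which avoids the AttributeError mentioned above. This is a sketch; `csp_sha256` is a hypothetical name, not part of any library:

```python
import base64
import hashlib


def csp_sha256(inline):
    """Return a 'sha256-<base64>' token for an inline script (str or bytes)."""
    if isinstance(inline, str):
        # str must be encoded before hashing; bytes are hashed as they are.
        inline = inline.encode("utf-8")
    digest = hashlib.sha256(inline).digest()  # 32 raw bytes
    return "sha256-" + base64.b64encode(digest).decode("ascii")


token = csp_sha256("console.log('hello');")
print(token)
```

In a Content-Security-Policy header the token is written inside single quotes, for example `script-src 'sha256-...'`.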

