Different output after decode and encode data (base64)

If I run:

import base64
data = open('1.dat', 'rb').read()
decoded = base64.b64decode(data)
encoded = base64.b64encode(decoded)
data == encoded

I get False as the result. How can I decode and then re-encode the data to get the original content back?

Unfortunately, not all Base64 is the same: implementations differ. For example, some insert a line break every 76 characters when encoding, and some don't.
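As a minimal sketch of that difference, assuming Python's standard base64 module: base64.encodebytes() wraps its output MIME-style with a newline after every 76 characters, while base64.b64encode() produces one unbroken line. The sample payload is made up for illustration.

```python
import base64

# Hypothetical sample payload, long enough to need more than one output line.
raw = bytes(range(100))

plain = base64.b64encode(raw)      # one unbroken line
wrapped = base64.encodebytes(raw)  # newline after every 76 output characters

print(plain == wrapped)                      # the two encoded texts differ
print(base64.b64decode(wrapped) == raw)      # but both decode to the same bytes
print(wrapped.replace(b"\n", b"") == plain)  # only the newlines differ
```

So a file written by a line-wrapping encoder will not compare equal to the output of b64encode(), even though both carry the same data.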

You have to b64encode() the data before you b64decode() it:

>>> import base64
>>> data = b"qwertzuiop"
>>> encoded = base64.b64encode(data)
>>> decoded = base64.b64decode(encoded)
>>> data == decoded
True

If your input file is already Base64, you need to b64decode() it first, not encode it. So your code should be this:

import base64
data = open('1.dat', 'rb').read()  # base64 encoded string
decoded = base64.b64decode(data)
encoded = base64.b64encode(decoded)
data == encoded

If you get False as a result, your data was Base64-encoded differently from how the base64 module encodes it.
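One way to check whether only the encoding style differs is to compare the decoded bytes rather than the encoded text. The following is a rough sketch; same_payload is a hypothetical helper name and the sample values are made up for illustration.

```python
import base64

def same_payload(a: bytes, b: bytes) -> bool:
    """Return True if two Base64 texts decode to the same bytes."""
    # By default b64decode() discards characters outside the Base64
    # alphabet, such as newlines, before decoding.
    return base64.b64decode(a) == base64.b64decode(b)

one_line = b"aGVsbG8gd29ybGQ="
wrapped = b"aGVsbG8g\nd29ybGQ=\n"  # same data with line breaks added

print(one_line == wrapped)              # byte-for-byte comparison of the text
print(same_payload(one_line, wrapped))  # comparison of the decoded data
```

If the decoded bytes match, the data is intact and the two texts differ only in how they were formatted.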

There is some flexibility in how Base64 is encoded, for example the insertion of newlines. There are also alternate characters, which the base64 module lets you specify when encoding and decoding. It's up to you to make sure the proper alternate characters are specified, but otherwise it's easy to compare two strings while ignoring any newlines or whitespace:

b''.join(data.split()) == b''.join(encoded.split())  # bytes in Python 3, since the file was opened in 'rb' mode
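To illustrate the alternate characters mentioned above, here is a small sketch using the URL-safe variant from the standard base64 module, which substitutes '-' and '_' for '+' and '/'. The two-byte input is an arbitrary value chosen so that the standard encoding contains both special characters.

```python
import base64

# Arbitrary sample bytes whose standard encoding contains '+' and '/'.
raw = b"\xfb\xff"

standard = base64.b64encode(raw)          # uses '+' and '/'
url_safe = base64.urlsafe_b64encode(raw)  # uses '-' and '_' instead

print(standard, url_safe)

# Decoding works when the decoder uses the same alphabet as the encoder.
print(base64.urlsafe_b64decode(url_safe) == raw)
print(base64.b64decode(url_safe, altchars=b"-_") == raw)
```

If the file was produced with such an alternate alphabet, re-encoding with plain b64encode() will not reproduce it; pass the matching altchars (or use the urlsafe_ functions) on both sides.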
