Base64 Decode: Specific String Incorrect Padding (with correct padding)
I am trying to Base64 decode a string (into bytes) using Python's base64.b64decode(str) method:
46oWrWpy2gTEGwNnN6Ayy
I am making sure it is padded with ='s to a multiple of 4 characters and, out of frustration, I have also tried each of these:
46oWrWpy2gTEGwNnN6Ayy=
46oWrWpy2gTEGwNnN6Ayy==
46oWrWpy2gTEGwNnN6Ayy===
46oWrWpy2gTEGwNnN6Ayy==================================================
and yet I still get "Incorrect padding" on Python v3.6.1. Other strings are fine.
I showed a colleague; he tried it on Python 2 and observed the same response.
I note that removing the first "4" is enough to make the Base64 decode work.
I have skim-read Python's docs (noting casefold doesn't apply to Base64) and haven't yet ventured further into RFC 3548, but wondered if someone else had encountered something similar before. Anyone have any clues :)?
Surely this can't be a bug in Python's Base64 decoder?
Worked it out.
Each character of Base64 text carries 6 bits of the raw data's 8-bit bytes. If the text ends on a character that is midway through a raw byte, the remaining bits of that byte are missing. Here the string has 21 characters, one more than a multiple of 4, so the last character holds 6 bits of a byte whose other 2 bits never arrived, and no amount of = can supply them.
The Wikipedia article (and many online answers) seem to treat padding as interchangeable with zero bits, which is not the case (in the Base64 alphabet, zero is encoded as an A).
Padding is not a substitute for missing data.
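A quick way to check that arithmetic, as a minimal sketch separate from the script below. The helper name `base64_bit_budget` is hypothetical and only illustrates the 6-bits-per-character counting:

```python
def base64_bit_budget(n_chars):
    """Return (whole_bytes, leftover_bits) carried by n_chars Base64 characters."""
    total_bits = 6 * n_chars  # each Base64 character encodes 6 bits
    return total_bits // 8, total_bits % 8


# Lengths around the 21-character string from the question.
for n in (20, 21, 22, 23):
    whole, leftover = base64_bit_budget(n)
    print(f"{n} chars -> {whole} whole bytes, {leftover} leftover bits")
```

With 22 or 23 characters the 4 or 2 leftover bits are only encoder filler, so == or = completes the group. With 21 characters the 6 leftover bits are a whole character's worth of data belonging to a byte that was never finished.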
#!/usr/bin/env python3

# We use hexlify for debugging.
import binascii
# We use the Base64 library.
import base64


# Base64 works on groups of 4 characters.
# Sometimes the last group has only 3, 2 or 1 characters, and a lone
# character is midway through a byte.
def relaxed_decode_base64(data):
    # If there is already padding we strip it, as we calculate padding ourselves.
    if '=' in data:
        data = data[:data.index('=')]
    # How many characters sit beyond the last complete group of 4.
    remainder = len(data) % 4
    # A lone character is midway through a byte: complete it with 'A' (zero bits).
    if remainder == 1:
        data += 'A=='
    # Otherwise just add on the correct length of padding.
    elif remainder == 2:
        data += '=='
    elif remainder == 3:
        data += '='
    # Actually perform the Base64 decode.
    return base64.b64decode(data)


# Debugging
print(str(relaxed_decode_base64('46oWrWpy2gTEGwNnN6Ayy')) + '\n')
testString = ''
for count in range(0, 1024):
    testString += '/'
    print(str(len(testString)) + ' - ' + testString)
    print(binascii.hexlify(relaxed_decode_base64(testString)))
    # Pause so each length can be inspected; press Enter to continue.
    input()
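To see what the appended 'A' actually does, here is a small comparison, offered as a sketch rather than a recommendation. It decodes the question's string once with 'A==' appended (what relaxed_decode_base64 does for a length that is 1 more than a multiple of 4) and once with the dangling character simply dropped, which is an alternative assumption about what the caller wants:

```python
import base64

s = '46oWrWpy2gTEGwNnN6Ayy'

# Same effect as relaxed_decode_base64 for this length: complete the byte with zero bits.
with_filler = base64.b64decode(s + 'A==')
# Alternative: discard the dangling character instead of inventing bits for it.
truncated = base64.b64decode(s[:-1])

print(len(with_filler), with_filler.hex())
print(len(truncated), truncated.hex())
```

The two results share the first 15 bytes. The 'A' variant adds a 16th byte, 0xc8, built from the 6 bits of the final 'y' plus two made-up zero bits, so whether that byte is meaningful depends on where the string was cut.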
Seems to be a problem in your data, not related to Python:
$ echo 46oWrWpy2gTEGwNnN6Ayy | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy= | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy== | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy=== | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy==== | base64 -d
ãªjrÚÄg7 2base64: invalid input
I managed to decode it this way (removed the last 'y'):
$ echo 46oWrWpy2gTEGwNnN6Ay | base64 -d
ãªjrÚÄg7 2
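The same check can be repeated in Python. This is only a sketch mirroring the shell commands above: it tries the padded variants, records which ones raise binascii.Error, and then decodes the string with the last 'y' removed. The exact error message differs between Python versions, so it is printed rather than assumed:

```python
import base64
import binascii

s = '46oWrWpy2gTEGwNnN6Ayy'

failures = []
for padding in ('', '=', '==', '===', '===='):
    try:
        base64.b64decode(s + padding)
    except binascii.Error as exc:
        # 21 data characters can never form complete bytes, whatever the padding.
        failures.append(padding)
        print(repr(s + padding), 'failed:', exc)

# Dropping the dangling character leaves 20 characters, i.e. five full groups.
decoded = base64.b64decode(s[:-1])
print(len(decoded), decoded)
```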