Base64 decode: "Incorrect padding" for a specific string (despite correct padding)

Question

I am trying to Base64 decode a string (into bytes) using Python's base64.b64decode(str) method:

46oWrWpy2gTEGwNnN6Ayy

and I am ensuring it is padded with ='s to a multiple of 4 or, out of frustration, trying any of these:

46oWrWpy2gTEGwNnN6Ayy=

46oWrWpy2gTEGwNnN6Ayy==

46oWrWpy2gTEGwNnN6Ayy===

46oWrWpy2gTEGwNnN6Ayy==================================================

and yet I still get "Incorrect padding" on Python v3.6.1. Other strings are fine.

I showed a colleague; he tried on Python 2 and observed the same response.

I note that removing the first "4" is enough to make the Base64 decode work.

I have skim-read Python's docs (noting casefold doesn't apply to Base64) and haven't yet ventured further into RFC 3548, but wondered whether someone else had encountered something similar before. Anyone have any clues :)? Surely this can't be a bug in Python's Base64 decoder?
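For reference, the behaviour described above can be reproduced with a short script. This is only a sketch: the helper name try_decode is made up for illustration, and the exact error text depends on the Python version (3.6 reports "Incorrect padding"; newer releases may word it differently).

```python
import base64
import binascii

s = "46oWrWpy2gTEGwNnN6Ayy"

def try_decode(text):
    """Return the decoded bytes, or the error message if decoding fails."""
    try:
        return base64.b64decode(text)
    except binascii.Error as exc:
        return "error: " + str(exc)

# The string has 21 characters, one more than a multiple of 4.
print(len(s), len(s) % 4)

# No amount of '=' padding helps.
for pad in ("", "=", "==", "==="):
    print(repr(s + pad), "->", try_decode(s + pad))

# Dropping one character leaves 20 characters, which decode cleanly.
print(try_decode(s[1:]))
```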

Answer 1

Worked it out.

Each character of Base64 text encodes 6 bits of the raw data, whereas each raw byte is 8 bits. If the text stops on a character that falls mid-way through a raw byte, the remaining bits of that byte are missing. The Wikipedia article (and many online answers) seem to treat padding as interchangeable with a zero byte, which is not the case: in the Base64 alphabet, zero bits are encoded as an A.

Padding is not a substitute for missing data.
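The arithmetic behind this can be sketched in a few lines. The helper leftover_bits is a hypothetical name used only for illustration; it counts the bits carried by the characters that follow the last complete group of four.

```python
import base64

def leftover_bits(n_chars):
    """Bits carried by the characters after the last complete group of 4."""
    return (n_chars % 4) * 6

for n_chars in (20, 21, 22, 23):
    bits = leftover_bits(n_chars)
    print(n_chars, "chars:", bits, "leftover bits =",
          bits // 8, "whole byte(s) +", bits % 8, "spare bit(s)")

# 'A' is index 0 of the alphabet, i.e. six zero data bits; '=' carries no data.
print(base64.b64encode(b"\x00\x00\x00"))  # b'AAAA'
```

A 21-character string leaves one character over, and its 6 bits cannot make up a whole byte, which is why no quantity of "=" makes the string in the question valid.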

#!/usr/bin/env python3

# We use hexlify for debugging.
import binascii

# We use the Base64 library.
import base64

# Base64 works on groups of 4 characters.
# Sometimes we get only 3, 2 or 1 trailing characters, and the last one may
# stop part-way through a byte.
def relaxed_decode_base64(data):

    # If there is already padding, strip it; we calculate the padding ourselves.
    if '=' in data:
        data = data[:data.index('=')]

    # Number of characters left over after the last complete group of 4.
    leftover = len(data) % 4

    # One leftover character stops mid-way through a byte: complete it with
    # zero bits ('A'), then pad.
    if leftover == 1:
        data += 'A=='
    # Otherwise just add the correct length of padding.
    elif leftover == 2:
        data += '=='
    elif leftover == 3:
        data += '='

    # Actually perform the Base64 decode.
    return base64.b64decode(data)

# Debugging
print(str(relaxed_decode_base64('46oWrWpy2gTEGwNnN6Ayy')) + '\n')

test_string = ''

# Try every length from 1 to 1024 characters, one step per key press.
for count in range(1024):
    test_string += '/'
    print(str(len(test_string)) + ' - ' + test_string)
    print(binascii.hexlify(relaxed_decode_base64(test_string)))
    input()  # Wait for Enter before trying the next length.

Answer 2

Seems to be a problem in your data, not related to Python:

$ echo 46oWrWpy2gTEGwNnN6Ayy | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy= | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy== | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy=== | base64 -d
ãªjrÚÄg7 2base64: invalid input
$ echo 46oWrWpy2gTEGwNnN6Ayy==== | base64 -d
ãªjrÚÄg7 2base64: invalid input

I managed to decode it this way (removed the last 'y'):

$ echo 46oWrWpy2gTEGwNnN6Ay | base64 -d
ãªjrÚÄg7 2
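The same check can be made in Python. The sketch below compares the two repairs discussed in this thread: dropping the stray 21st character, and completing it with zero bits ("A") before padding. The variable names are illustrative only.

```python
import base64

s = "46oWrWpy2gTEGwNnN6Ayy"

# Drop the stray 21st character, as in the shell session above.
truncated = base64.b64decode(s[:-1])

# Complete the stray character with zero bits and pad, as in Answer 1.
zero_filled = base64.b64decode(s + "A==")

print(len(truncated), truncated.hex())
print(len(zero_filled), zero_filled.hex())

# Both repairs agree on the first 15 bytes; the second adds one guessed byte.
print(zero_filled[:len(truncated)] == truncated)
```

Which repair is appropriate depends on where the data came from: truncation discards the 6 bits of the final character, while zero-filling invents the 2 missing bits of a sixteenth byte.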

2 answers

Answer 1: score 2, accepted, 2017-06-07 10:48:17

Answer 2: score 0, 2017-05-24 18:03:46
