
How to encode a long in Base64 in Python?

In Java, I can encode a BigInteger as:

java.math.BigInteger bi = new java.math.BigInteger("65537");
String encoded = Base64.encodeBytes(bi.toByteArray(), Base64.ENCODE|Base64.DONT_GUNZIP);

// result: 65537 encodes as "AQAB" in Base64

byte[] decoded = Base64.decode(encoded, Base64.DECODE|Base64.DONT_GUNZIP);
java.math.BigInteger back = new java.math.BigInteger(decoded);

In C#:

System.Numerics.BigInteger bi = System.Numerics.BigInteger.Parse("65537");
string encoded = Convert.ToBase64String(bi.ToByteArray());
byte[] decoded = Convert.FromBase64String(encoded);
System.Numerics.BigInteger back = new System.Numerics.BigInteger(decoded);

How can I encode long integers in Python as Base64-encoded strings? What I've tried so far produces results different from implementations in other languages (so far I've tried Java and C#); in particular, it produces longer Base64-encoded strings.

import struct
encoded = struct.pack('I', (1<<16)+1).encode('base64')[:-1]
# produces a longer string, 'AQABAA==' instead of the expected 'AQAB'

When I use this Python code to produce a Base64-encoded string, decoding it in Java (for example) yields 16777472 instead of the expected 65537. Firstly, what am I missing?

Secondly, I have to figure out by hand which length format to use in struct.pack; and if I'm trying to encode a long number (greater than (1<<64)-1), the 'Q' format specification is too short to hold the representation. Does that mean that I have to do the representation by hand, or is there an undocumented format specifier for the struct.pack function? (I'm not compelled to use struct, but at first glance it seemed to do what I needed.)

Check out this page on converting integer to base64.

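import base64
import struct

def encode(n):
    # Pack as little-endian unsigned 64-bit, then drop the high-order zero bytes.
    # Bytes literals (b'...') are used so this also runs on Python 3.
    data = struct.pack('<Q', n).rstrip(b'\x00')
    if len(data) == 0:
        data = b'\x00'  # zero still needs one byte
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s

def decode(s):
    # s is the byte string returned by encode(); re-add the stripped padding.
    data = base64.urlsafe_b64decode(s + b'==')
    # Pad back to 8 bytes before unpacking.
    n = struct.unpack('<Q', data + b'\x00' * (8 - len(data)))
    return n[0]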
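
For what it's worth, the 16777472 from the question can be reproduced in a few lines. This is only a sketch, assuming Python 3 (for int.to_bytes and int.from_bytes); the variable names are made up for illustration. The question's struct.pack call emits a fixed four bytes in little-endian order, while Java's BigInteger(byte[]) reads its input as big-endian:

```python
import base64
import struct

# What the question did: a fixed-width 32-bit pack. '<I' spells out
# little-endian explicitly (plain 'I' uses the machine's native order).
packed = struct.pack('<I', 65537)                # b'\x01\x00\x01\x00'
question_b64 = base64.b64encode(packed)          # b'AQABAA=='

# Reading those four bytes as big-endian, the way Java does, makes the
# trailing zero byte the least significant one.
as_java_sees_it = int.from_bytes(packed, 'big')  # 16777472 == 0x01000100

# Big-endian with no padding bytes gives the three bytes Java produced.
minimal = (65537).to_bytes(3, 'big')             # b'\x01\x00\x01'
expected_b64 = base64.b64encode(minimal)         # b'AQAB'

# 65537 reads the same in both byte orders, which hides the difference.
# A value such as 258 (0x0102) does not:
little = (258).to_bytes(2, 'little')             # b'\x02\x01'
big = (258).to_bytes(2, 'big')                   # b'\x01\x02'
```

So a little-endian encoder that trims zero bytes will agree with Java on 65537 but not on most other values.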

The struct module:

… performs conversions between Python values and C structs represented as Python strings.

Because C doesn't have infinite-length integers, there's no functionality for packing them.

But it's very easy to write yourself. For example:

def pack_bigint(i):
    b = bytearray()
    while i:
        b.append(i & 0xFF)
        i >>= 8
    return b

Or:

def pack_bigint(i):
    bl = (i.bit_length() + 7) // 8
    fmt = '<{}B'.format(bl)
    # ...

And so on.
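
On Python 3 (an assumption; the snippets above were written for Python 2), the built-in int.to_bytes may be enough on its own. A minimal sketch for non-negative integers, reusing the name pack_bigint purely for illustration:

```python
def pack_bigint(i):
    """Return the little-endian bytes of a non-negative int (at least one byte)."""
    # bit_length() is 0 for i == 0, so force a minimum length of one byte.
    length = max(1, (i.bit_length() + 7) // 8)
    return i.to_bytes(length, 'little')
```

Unlike the loop version, this returns a single zero byte for 0 rather than an empty result, and it raises OverflowError for negative input instead of looping forever.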

And of course you'll want an unpack function, like jbatista's from the comments:

def unpack_bigint(b):
    b = bytearray(b) # in case you're passing in a bytes/str
    return sum((1 << (bi*8)) * bb for (bi, bb) in enumerate(b))
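
If the goal is to match the Java output from the question, the byte layout matters: BigInteger.toByteArray() returns big-endian two's complement in the minimum number of bytes that still holds a sign bit. A hedged sketch of that layout, assuming Python 3 and standard (padded) Base64; the names bigint_to_b64 and b64_to_bigint are hypothetical:

```python
import base64

def bigint_to_b64(n):
    # Bits needed excluding the sign bit; for negatives this is the
    # bit length of ~n, which mirrors how two's complement is sized.
    bits = n.bit_length() if n >= 0 else (~n).bit_length()
    length = bits // 8 + 1  # the "+ 1" leaves room for the sign bit
    raw = n.to_bytes(length, 'big', signed=True)
    return base64.b64encode(raw).decode('ascii')

def b64_to_bigint(s):
    raw = base64.b64decode(s)
    return int.from_bytes(raw, 'big', signed=True)

print(bigint_to_b64(65537))    # AQAB
print(b64_to_bigint('AQAB'))   # 65537
```

Note that a value such as 128 takes two bytes here (a leading zero byte keeps it positive), so its encoding is longer than an unsigned packer would give.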

This is a bit late, but I figured I'd throw my hat in the ring:

import base64
import struct

def inttob64(n):
    """
    Given an integer returns the base64 encoded version of it (no trailing ==)
    """
    parts = []
    while n:
        # Take the low 32 bits; the most significant word ends up first.
        parts.insert(0, n & 0xffffffff)
        n >>= 32
    data = struct.pack('>' + 'L' * len(parts), *parts)
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s

def b64toint(s):
    """
    Given a string with a base64 encoded value, return the integer representation
    of it
    """
    data = base64.urlsafe_b64decode(s + b'==')
    n = 0
    while data:
        n <<= 32
        (toor,) = struct.unpack('>L', data[:4])
        n |= toor & 0xffffffff
        data = data[4:]
    return n

These functions convert an arbitrary-sized long to and from a big-endian Base64 representation. The byte string is always a whole number of 32-bit words, so small values carry leading zero bytes.

Here is something that may help. Instead of using struct.pack(), I build a string of bytes and then call the Base64 encoder on it. I didn't write the decoder, but decoding clearly recovers an identical string of bytes, and a loop over those bytes can recover the original value. I don't know whether you need fixed-size integers (say, always 128-bit) or big-endian order, so I left the decoder for you.

Also, encode64() and decode64() are from @msc's answer, but modified to work.

import base64
import struct

def encode64(n):
  # Little-endian unsigned 64-bit, with the high-order zero bytes dropped.
  data = struct.pack('<Q', n).rstrip(b'\x00')
  if len(data) == 0:
    data = b'\x00'
  s = base64.urlsafe_b64encode(data).rstrip(b'=')
  return s

def decode64(s):
  data = base64.urlsafe_b64decode(s + b'==')
  # Pad back to 8 bytes before unpacking.
  n = struct.unpack('<Q', data + b'\x00' * (8 - len(data)))
  return n[0]

def encode(n, big_endian=False):
    lst = bytearray()
    while True:
        # Peel off the least significant byte each time around.
        n, lsb = divmod(n, 0x100)
        lst.append(lsb)
        if not n:
            break
    if big_endian:
        # I have not tested Big Endian mode, and it may need to have
        # some initial zero bytes prepended; like, if the integer is
        # supposed to be a 128-bit integer, and you encode a 1, you
        # would need this to have 15 leading zero bytes.
        initial_zero_bytes = b'\x00' * 2
        data = initial_zero_bytes + bytes(lst[::-1])
    else:
        data = bytes(lst)
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s

print(encode(1234567890098765432112345678900987654321))
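
The decoder is left to the reader above; one possible shape for it is sketched below. It assumes Python 3, input produced by the encode() function above (URL-safe alphabet, padding stripped), and a non-negative value. The name decode and its big_endian parameter are made up to mirror the encoder:

```python
import base64

def decode(s, big_endian=False):
    if isinstance(s, str):
        s = s.encode('ascii')
    # Restore the '=' padding that the encoder stripped.
    data = base64.urlsafe_b64decode(s + b'=' * (-len(s) % 4))
    if not big_endian:
        data = data[::-1]  # put the most significant byte first
    n = 0
    for byte in data:      # iterating over bytes yields ints on Python 3
        n = (n << 8) | byte
    return n
```

Leading zero bytes in big-endian input, such as the two the encoder prepends, do not change the result.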
