Is there a faster way of converting a number to a name?

The following code defines a sequence of names that are mapped to numbers. It is designed to take a number and retrieve a specific name. The class operates by ensuring the name exists in its cache, and then returns the name by indexing into that cache. The question is this: how can the name be calculated from the number without storing a cache?

The name can be thought of as a base 63 number, except for the first digit, which is always in base 53.

import itertools
import string

class NumberToName:

    def __generate_name():
        def generate_tail(length):
            if length > 0:
                for char in NumberToName.CHARS:
                    for extension in generate_tail(length - 1):
                        yield char + extension
            else:
                yield ''
        for length in itertools.count():
            for char in NumberToName.FIRST:
                for extension in generate_tail(length):
                    yield char + extension

    FIRST = ''.join(sorted(string.ascii_letters + '_'))
    CHARS = ''.join(sorted(string.digits + FIRST))
    CACHE = []
    NAMES = __generate_name()

    @classmethod
    def convert(cls, number):
        for _ in range(number - len(cls.CACHE) + 1):
            cls.CACHE.append(next(cls.NAMES))
        return cls.CACHE[number]

    def __init__(self, *args, **kwargs):
        raise NotImplementedError()

The following interactive session shows some of the values that are expected to be returned, in order.

>>> NumberToName.convert(0)
'A'
>>> NumberToName.convert(26)
'_'
>>> NumberToName.convert(52)
'z'
>>> NumberToName.convert(53)
'A0'
>>> NumberToName.convert(1692)
'_1'
>>> NumberToName.convert(23893)
'FAQ'

Unfortunately, these numbers need to be mapped to these exact names (to allow a reverse conversion).


Please note: A variable number of bits are received and converted unambiguously into a number. This number should be converted unambiguously to a name in the Python identifier namespace. Eventually, valid Python names will be converted to numbers, and these numbers will be converted to a variable number of bits.


Final solution:

import string

HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)

def convert_number_to_name(number):
    if number < HEAD_BASE: return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return convert_number_to_name(q) + TAIL_CHAR[r]
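Since the question mentions that a reverse conversion is needed, here is a minimal sketch of how the inverse might look. The function name `convert_name_to_number` is an assumption for illustration and does not appear in the original post; it simply undoes the `divmod` step one character at a time.

```python
import string

HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)

def convert_number_to_name(number):
    # Same recursion as the final solution above.
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return convert_number_to_name(q) + TAIL_CHAR[r]

def convert_name_to_number(name):
    # Hypothetical inverse (not from the original post).
    # The forward step is number = q * TAIL_BASE + r + HEAD_BASE,
    # so each extra character reapplies that formula to the prefix value.
    number = HEAD_CHAR.index(name[0])
    for char in name[1:]:
        number = number * TAIL_BASE + TAIL_CHAR.index(char) + HEAD_BASE
    return number

if __name__ == '__main__':
    for n in (0, 26, 52, 53, 1692, 23893):
        name = convert_number_to_name(n)
        print(n, name, convert_name_to_number(name))
```

`str.index` raises `ValueError` for characters outside the alphabet, which doubles as a crude validity check on the input name.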

This is a fun little problem full of off-by-one errors.

Without loops:

import string

first_digits = sorted(string.ascii_letters + '_')
rest_digits = sorted(string.digits + string.ascii_letters + '_')

def convert(number):
    if number < len(first_digits):
        return first_digits[number]

    current_base = len(rest_digits)
    remain = number - len(first_digits)
    # Use floor division so the recursive argument stays an integer on Python 3.
    return convert(remain // current_base) + rest_digits[remain % current_base]

And the tests:

print(convert(0))
print(convert(26))
print(convert(52))
print(convert(53))
print(convert(1692))
print(convert(23893))

Output:

A
_
z
A0
_1
FAQ

What you've got is a corrupted form of bijective numeration (the usual example being spreadsheet column names, which are bijective base-26).

One way to generate bijective numeration:

import string

def bijective(n, digits=string.ascii_uppercase):
    result = []
    while n > 0:
        n, mod = divmod(n - 1, len(digits))
        result += digits[mod]
    return ''.join(reversed(result))
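To make the spreadsheet-column comparison concrete, here is a small usage sketch of the same function. The sample inputs are chosen for illustration and are not from the original answer.

```python
import string

def bijective(n, digits=string.ascii_uppercase):
    # Bijective base-len(digits): there is no zero digit,
    # so 0 maps to the empty string and 1 maps to the first digit.
    result = []
    while n > 0:
        n, mod = divmod(n - 1, len(digits))
        result.append(digits[mod])
    return ''.join(reversed(result))

if __name__ == '__main__':
    # 26 is the last one-letter name, 27 the first two-letter name,
    # 702 the last two-letter name, 703 the first three-letter name.
    for n in (0, 1, 26, 27, 702, 703):
        print(n, repr(bijective(n)))
```

Note that `bijective(0)` is the empty string, which is why the scheme in the question, where 0 maps to `'A'`, is offset by one.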

All you need to do is supply a different set of digits for the case where 53 >= n > 0. You will also need to increment n by 1, since the bijective representation of 0 is properly the empty string, not "A":

def name(n, first=sorted(string.ascii_letters + '_'), digits=sorted(string.ascii_letters + '_' + string.digits)):
    result = []
    while n >= len(first):
        n, mod = divmod(n - len(first), len(digits))
        result += digits[mod]
    result += first[n]
    return ''.join(reversed(result))

Tested for the first 10,000 names:

import string

first_chars = sorted(string.ascii_letters + '_')
later_chars = sorted(list(string.digits) + first_chars)

def f(n):
    # first, determine length by subtracting the number of items of length l
    # also determines the index into the list of names of length l
    ix = n
    l = 1
    while ix >= 53 * (63 ** (l-1)):
        ix -= 53 * (63 ** (l-1))
        l += 1

    # determine first character
    first = first_chars[ix // (63 ** (l-1))]

    # rest of string is just a base 63 number
    s = ''
    rem = ix % (63 ** (l-1))
    for i in range(l-1):
        s = later_chars[rem % 63] + s
        rem //= 63

    return first+s
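A check along the lines of "tested for the first 10,000 names" could be sketched as follows. The helpers `reference_names` and `first_mismatch` are hypothetical names introduced here, and the brute-force generator is an `itertools.product` rewrite of the question's nested generator rather than the original class.

```python
import itertools
import string

first_chars = sorted(string.ascii_letters + '_')
later_chars = sorted(string.digits + string.ascii_letters + '_')

def reference_names():
    # Brute-force enumeration in the same order as the question's generator:
    # shorter names first, then lexicographic order within each length.
    for length in itertools.count():
        for head in first_chars:
            for tail in itertools.product(later_chars, repeat=length):
                yield head + ''.join(tail)

def f(n):
    # Closed-form conversion, copied from the answer above.
    ix, l = n, 1
    while ix >= 53 * (63 ** (l - 1)):
        ix -= 53 * (63 ** (l - 1))
        l += 1
    first = first_chars[ix // (63 ** (l - 1))]
    s = ''
    rem = ix % (63 ** (l - 1))
    for _ in range(l - 1):
        s = later_chars[rem % 63] + s
        rem //= 63
    return first + s

def first_mismatch(func, limit):
    # Return the first number whose name disagrees with the reference,
    # or None if the first `limit` names all agree.
    expected_names = itertools.islice(reference_names(), limit)
    for n, expected in enumerate(expected_names):
        if func(n) != expected:
            return n
    return None

if __name__ == '__main__':
    print(first_mismatch(f, 10000))
```

Any of the other closed-form functions on this page can be passed to `first_mismatch` in place of `f`.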

You can use the code in this answer to the question "Base 62 conversion in Python" (or perhaps one of the other answers).

Using the referenced code, I think the answer to your real question, "how can the name be calculated based on the number without storing a cache?", would be to make the name the simple base 62 conversion of the number, with a leading underscore added if the first character of the name would otherwise be a digit (the underscore is simply ignored when converting the name back into a number).

Here's sample code illustrating what I propose:

from base62 import base62_encode, base62_decode

def NumberToName(num):
    ret = base62_encode(num)
    return ('_' + ret) if ret[0] in '0123456789' else ret

def NameToNumber(name):
    # Compare by value; "is not" tests identity and is unreliable for strings.
    return base62_decode(name if name[0] != '_' else name[1:])

if __name__ == '__main__':
    def test(num):
        name = NumberToName(num)
        num2 = NameToNumber(name)
        print('NumberToName({0:5d}) -> {1!r:>6s}, NameToNumber({2!r:>6s}) -> {3:5d}'
              .format(num, name, name, num2))

    test(26)
    test(52)
    test(53)
    test(1692)
    test(23893)

Output:

NumberToName(   26) ->    'q', NameToNumber(   'q') ->    26
NumberToName(   52) ->    'Q', NameToNumber(   'Q') ->    52
NumberToName(   53) ->    'R', NameToNumber(   'R') ->    53
NumberToName( 1692) ->   'ri', NameToNumber(  'ri') ->  1692
NumberToName(23893) -> '_6dn', NameToNumber('_6dn') -> 23893

If the numbers could be negative, you might have to modify the code from the referenced answer (and there is some discussion there on how to do it).
