
Finding digits in powers of 2 fast

The task is to search every power of two below 2^10000 and return the index of the first power that contains a given string. For example, if the string to search for is "7", the program outputs 15, as 2^15 = 32768 is the first power to contain a 7.

I have approached this with a brute-force attempt, which times out on ~70% of test cases.

for i in range(1,9999):
    if search in str(2**i):
        print i
        break

How would one approach this with a time limit of 5 seconds?

Try not to compute 2^i at each step.

power = 1  # 2**0; renamed from pow to avoid shadowing the built-in
for i in xrange(10000):
    # invariant: power == 2**i
    if search in str(power):
        print i
        break
    power *= 2

You can compute it as you go along. This should save a lot of computation time.

Using xrange will prevent a list from being built, but that will probably not make much of a difference here.

in is probably implemented as a string search algorithm with quadratic worst-case time. It may (or may not, you'd have to test) be more efficient to use something like KMP for string searching.
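For reference, here is a minimal sketch of the KMP search mentioned above, written for Python 3. The name kmp_find is only illustrative and not from the original answer. A pure-Python loop like this may well be slower than the built-in in operator in practice, so it is worth measuring before adopting it.

```python
def kmp_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    # scan the text once, never moving backwards in it
    k = 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


print(kmp_find(str(2 ** 15), "7"))  # 2, since 2**15 == 32768
```

In the search loop, the test search in str(power) would then become kmp_find(str(power), search) != -1.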

A faster approach could be computing the numbers directly in decimal:

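def double(x):
    # x is a list of base-10**8 "limbs", least significant first;
    # the number it represents is doubled in place
    carry = 0
    for i, v in enumerate(x):
        d = v*2 + carry
        if d > 99999999:
            x[i] = d - 100000000
            carry = 1
        else:
            x[i] = d
            carry = 0
    if carry:
        x.append(carry)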

Then the search function can become:

def p2find(s):
    x = [1]
    for i in xrange(10000):
        # top limb unpadded, remaining limbs zero-padded to 8 digits
        text = str(x[-1]) + "".join("%08d" % limb for limb in x[::-1][1:])
        if s in text:
            return i
        double(x)
    return None

Note also that all the powers of two up to 2^10000 have only about 15 million digits in total, and searching static data is much faster. If the program does not have to be restarted for each query, then:

def p2find(s, digits = []):
    if len(digits) == 0:
        # This precomputation happens only ONCE
        p = 1
        for k in xrange(10000):
            digits.append(str(p))
            p *= 2
    for i, v in enumerate(digits):
        if s in v: return i
    return None

With this approach the first check will take some time; subsequent ones will be very fast.

Compute every power of two and build a suffix tree using each string. This is linear time in the size of all the strings. Now, the lookups are basically linear time in the length of each lookup string.

I don't think you can beat this for computational complexity.
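To illustrate the lookup side of this idea, here is a small Python 3 sketch that uses a naive suffix trie rather than a true linear-time suffix tree. Construction here is quadratic in the length of each number, so it is only practical for a small number of powers; the names build_suffix_trie and lookup are hypothetical. Lookups are still linear in the length of the query.

```python
def build_suffix_trie(limit):
    """Naive suffix trie over the decimal digits of 2**i for i < limit.

    Each node is a dict mapping a digit to a child node. The extra key
    "min" stores the smallest exponent whose digits reach that node.
    """
    root = {"min": 0}
    power = 1
    for i in range(limit):
        digits = str(power)
        for start in range(len(digits)):
            node = root
            for ch in digits[start:]:
                child = node.get(ch)
                if child is None:
                    # exponents are visited in increasing order, so the
                    # first visit records the minimum
                    child = {"min": i}
                    node[ch] = child
                node = child
        power *= 2
    return root


def lookup(root, query):
    """Return the first exponent whose power contains query, or None."""
    node = root
    for ch in query:
        node = node.get(ch)
        if node is None:
            return None
    return node["min"]


trie = build_suffix_trie(64)
print(lookup(trie, "7"))   # 15
print(lookup(trie, "65"))  # 16
```

A real suffix tree (for example one built with Ukkonen's algorithm) would be needed to get the linear construction time claimed above for all 10000 powers.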

There are only 10000 numbers. You don't need any complex algorithms. Simply calculate them in advance and then search. This should take merely 1 or 2 seconds.

powers_of_2 = [str(1<<i) for i in range(10000)]

def search(s):
    for i in range(len(powers_of_2)):
        if s in powers_of_2[i]:
            return i

Try this:

twos = []
twoslen = []
two = 1
for i in xrange(10000):
    twos.append(two)
    twoslen.append(len(str(two)))
    two *= 2

tens = []
ten = 1
for i in xrange(len(str(two))):
    tens.append(ten)
    ten *= 10

s = raw_input()
l = len(s)
n = int(s)

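for i in xrange(len(twos)):
    # only keep prefixes that still have at least l digits, so a query
    # with leading zeros (e.g. "01") cannot match a shorter number
    for j in xrange(twoslen[i] - l + 1):
        k = twos[i] // tens[j]  # drop the last j digits
        if k < n: continue
        if (k - n) % tens[l] == 0:
            print i
            exit()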

The idea is to precompute every power of 2 and of 10, and also the number of digits of every power of 2. In this way the problem reduces to finding the minimum i for which there exists a j such that, after removing the last j digits from 2 ** i, you obtain a number which ends with n. Expressed as a formula (with integer division): (2 ** i / 10 ** j - n) % 10 ** len(str(n)) == 0.
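The formula can be checked on a single power with a short Python 3 sketch. The helper name contains_by_arithmetic is hypothetical, and it uses len(s) rather than len(str(n)) as the width, an assumption made here so that queries with leading zeros behave like a plain substring test.

```python
def contains_by_arithmetic(power, s):
    """Check whether the digit string s occurs in power, using only
    integer division and modulo (no substring search)."""
    n = int(s)
    width = len(s)
    modulus = 10 ** width
    digits = len(str(power))
    # j is the number of trailing digits removed; stop while at least
    # `width` digits remain
    for j in range(digits - width + 1):
        k = power // 10 ** j
        if (k - n) % modulus == 0:
            return True
    return False


# 2**16 == 65536; removing the last 3 digits leaves 65, which ends with 65
print(contains_by_arithmetic(2 ** 16, "65"))  # True
print(contains_by_arithmetic(2 ** 10, "01"))  # False, 1024 has no "01"
```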

A big problem here is that converting a binary integer to decimal notation takes time quadratic in the number of bits (at least in the straightforward way Python does it). It's actually faster to fake your own decimal arithmetic, as @6502 did in his answer.

But it's very much faster to let Python's decimal module do it, at least under Python 3.3.2 (I don't know how much C acceleration is built into Python decimal versions before that). Here's code:

class S:
    def __init__(self):
        import decimal
        decimal.getcontext().prec = 4000  # way more than enough for 2**10000
        p2 = decimal.Decimal(1)
        full = []
        for i in range(10000):
            s = "%s<%s>" % (p2, i)
            ##assert s == "%s<%s>" % (str(2**i), i)
            full.append(s)
            p2 *= 2
        self.full = "".join(full)

    def find(self, s):
        import re
        pat = s + r"[^<>]*<(\d+)>"  # raw string, so \d is not treated as a string escape
        m = re.search(pat, self.full)
        if m:
            return int(m.group(1))
        else:
            print(s, "not found!")

and sample usage:

>>> s = S()
>>> s.find("1")
0
>>> s.find("2")
1
>>> s.find("3")
5
>>> s.find("65")
16
>>> s.find("7")
15
>>> s.find("00000")
1491
>>> s.find("666")
157
>>> s.find("666666")
2269
>>> s.find("66666666")
66666666 not found!

s.full is a string with a bit over 15 million characters. It looks like this:

>>> print(s.full[:20], "...", s.full[-20:])
1<0>2<1>4<2>8<3>16<4 ... 52396298354688<9999>

So the string contains each power of 2, with the exponent following each power, enclosed in angle brackets. The find() method constructs a regular expression to search for the desired substring, then looks ahead to find the exponent.

Playing around with this, I'm convinced that just about any way of searching is "fast enough". It's getting the decimal representations of the large powers that sucks up the vast bulk of the time. And the decimal module solves that one.
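As one example of a different search strategy over the same precomputed digits, the following Python 3 sketch joins the powers with commas, uses plain str.find, and maps the match position back to an exponent with bisect. This is not the code from the answer above; build_index and find_first are hypothetical names, and the query is assumed to consist of digits only.

```python
import bisect
import decimal


def build_index(limit=10000):
    """Return (text, offsets): all powers 2**i for i < limit joined by
    commas, plus the start offset of each power inside text."""
    parts = []
    with decimal.localcontext() as ctx:
        ctx.prec = 4000  # 2**9999 has 3010 digits, so this is exact
        p2 = decimal.Decimal(1)
        for _ in range(limit):
            parts.append(str(p2))
            p2 *= 2
    offsets = []
    pos = 0
    for part in parts:
        offsets.append(pos)
        pos += len(part) + 1  # +1 for the comma separator
    return ",".join(parts), offsets


def find_first(text, offsets, s):
    """Return the smallest exponent whose power contains s, or None.
    Assumes s contains only digits, so it cannot span a separator."""
    pos = text.find(s)
    if pos < 0:
        return None
    # the power containing pos is the last one starting at or before it
    return bisect.bisect_right(offsets, pos) - 1


text, offsets = build_index(100)
print(find_first(text, offsets, "7"))   # 15
print(find_first(text, offsets, "65"))  # 16
```

Because the powers appear in increasing order of exponent, the leftmost match in the joined string belongs to the first power that contains the query.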
