Code is taking too much time

I wrote code to arrange numbers after taking user input. The ordering requires that the sum of every pair of adjacent numbers is prime. For inputs up to 10 the code works fine. If I go beyond that, the system hangs. Please let me know the steps to optimize it.

Example input: 8
The answer should be: (1, 2, 3, 4, 7, 6, 5, 8)
The code is as follows:

import itertools

x = raw_input("please enter a number")
range_x = range(int(x)+1)
del range_x[0]
result = list(itertools.permutations(range_x))
def prime(x):
    for i in xrange(1,x,2):
        if i == 1:
            i = i+1
        if x%i==0 and i < x :
            return False
    else:
        return True

def is_prime(a):
    for i in xrange(len(a)):
        print a
        if i < len(a)-1:
            if prime(a[i]+a[i+1]):
                pass
            else:
                return False
        else:
            return True


for i in xrange(len(result)):
    if i < len(result)-1:
        if is_prime(result[i]):
            print 'result is:'
            print result[i]
            break
    else:
        print 'result is'
        print result[i-1]

This answer is based on @Tim Peters' suggestion about Hamiltonian paths.

There are many possible solutions. To avoid excessive memory consumption for intermediate solutions, a random path can be generated. This also makes it easy to utilize multiple CPUs (each CPU generates its own paths in parallel).

import multiprocessing as mp
import sys

def main():
    number = int(sys.argv[1])

    # directed graph, vertices: 1..number (including ends)
    # there is an edge between i and j if (i+j) is prime
    vertices = range(1, number+1)
    G = {} # vertex -> adjacent vertices
    is_prime = sieve_of_eratosthenes(2*number+1)
    for i in vertices:
        G[i] = []
        for j in vertices:
            if is_prime[i + j]:
                G[i].append(j) # there is an edge from i to j in the graph

    # utilize multiple cpus
    q = mp.Queue()
    for _ in range(mp.cpu_count()):
        p = mp.Process(target=hamiltonian_random, args=[G, q])
        p.daemon = True # do not survive the main process
        p.start()
    print(q.get())

if __name__=="__main__":
    main()

where the Sieve of Eratosthenes is:

def sieve_of_eratosthenes(limit):
    is_prime = [True]*limit
    is_prime[0] = is_prime[1] = False # zero and one are not primes
    for n in range(int(limit**.5 + .5)):
        if is_prime[n]:
            for composite in range(n*n, limit, n):
                is_prime[composite] = False
    return is_prime

and:

import random

def hamiltonian_random(graph, result_queue):
    """Build random paths until Hamiltonian path is found."""
    vertices = list(graph.keys())
    while True:
        # build random path
        path = [random.choice(vertices)] # start with a random vertex
        while True: # loop for as long as the path can be extended with a random adjacent vertex
            neighbours = graph[path[-1]]
            random.shuffle(neighbours)
            for adjacent_vertex in neighbours:
                if adjacent_vertex not in path:
                    path.append(adjacent_vertex)
                    break
            else: # can't extend path
                break

        # check whether it is hamiltonian
        if len(path) == len(vertices):
            assert set(path) == set(vertices)
            result_queue.put(path) # found hamiltonian path
            return

Example

$ python order-adjacent-prime-sum.py 20

Output

[19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]

The output is a random sequence that satisfies the conditions:

  • it is a permutation of the range from 1 to 20 (inclusive)
  • the sum of adjacent numbers is prime
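
Both conditions are easy to check mechanically. The following is a minimal sketch, not part of the original answer: the helper names is_prime and is_valid_ordering are made up for illustration, and trial division is used only because the sums involved are small.

```python
def is_prime(k):
    """Trial division; adequate for the small sums involved here."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def is_valid_ordering(seq):
    """True if seq is a permutation of 1..len(seq) whose adjacent sums are all prime."""
    n = len(seq)
    if sorted(seq) != list(range(1, n + 1)):
        return False
    return all(is_prime(a + b) for a, b in zip(seq, seq[1:]))

# The sequence printed above for n = 20
sample = [19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]
print(is_valid_ordering(sample))  # True
```

The same check accepts the ordering from the question, (1, 2, 3, 4, 7, 6, 5, 8), and rejects the plain ascending sequence 1..8 because 4 + 5 = 9 is not prime.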

Time performance

It takes around 10 seconds on average to get a result for n = 900. Extrapolating the time as an exponential function, it should take around 20 seconds for n = 1000:

[Figure: time performance]

The image is generated using this code:

import numpy as np
figname = 'hamiltonian_random_noset-noseq-900-900'
Ns, Ts = np.loadtxt(figname+'.xy', unpack=True)

# use polyfit to fit the data
# y = c*a**n
# log y = log (c * a ** n)
# log Ts = log c + Ns * log a
coeffs = np.polyfit(Ns, np.log2(Ts), deg=1)
poly = np.poly1d(coeffs, variable='Ns')

# use curve_fit to fit the data
from scipy.optimize import curve_fit
def func(x, a, c):
    return c*a**x
popt, pcov = curve_fit(func, Ns, Ts)
aa, cc = popt
a, c = 2**coeffs

# plot it
import matplotlib.pyplot as plt
plt.figure()
plt.plot(Ns, np.log2(Ts), 'ko', label='time measurements')
plt.plot(Ns, np.polyval(poly, Ns), 'r-',
         label=r'$time = %.2g\times %.4g^N$' % (c, a))
plt.plot(Ns, np.log2(func(Ns, *popt)), 'b-',
         label=r'$time = %.2g\times %.4g^N$' % (cc, aa))
plt.xlabel('N')
plt.ylabel('log2(time in seconds)')
plt.legend(loc='upper left')
plt.show()

Fitted values:

>>> c*a**np.array([900, 1000])
array([ 11.37200806,  21.56029156])
>>> func([900, 1000], *popt)
array([ 14.1521409 ,  22.62916398])

For posterity ;-), here's one more based on finding a Hamiltonian path. It's Python3 code. As written, it stops upon finding the first path, but can easily be changed to generate all paths. On my box, it finds a solution for all n in 1 through 900 inclusive in about one minute total. For n somewhat larger than 900, it exceeds the maximum recursion depth.

The prime generator ( psieve() ) is vast overkill for this particular problem, but I had it handy and didn't feel like writing another ;-)

The path finder ( ham() ) is a recursive backtracking search, using what's often (but not always) a very effective ordering heuristic: of all the vertices adjacent to the last vertex in the path so far, look first at those with the fewest remaining exits. For example, this is "the usual" heuristic applied to solving Knight's Tour problems. In that context, it often finds a tour with no backtracking needed at all. Your problem appears to be a little tougher than that.

def psieve():
    import itertools
    yield from (2, 3, 5, 7)
    D = {}
    ps = psieve()
    next(ps)
    p = next(ps)
    assert p == 3
    psq = p*p
    for i in itertools.count(9, 2):
        if i in D:      # composite
            step = D.pop(i)
        elif i < psq:   # prime
            yield i
            continue
        else:           # composite, = p*p
            assert i == psq
            step = 2*p
            p = next(ps)
            psq = p*p
        i += step
        while i in D:
            i += step
        D[i] = step

def build_graph(n):
    primes = set()
    for p in psieve():
        if p > 2*n:
            break
        else:
            primes.add(p)

    np1 = n+1
    adj = [set() for i in range(np1)]
    for i in range(1, np1):
        for j in range(i+1, np1):
            if i+j in primes:
                adj[i].add(j)
                adj[j].add(i)
    return set(range(1, np1)), adj

def ham(nodes, adj):
    class EarlyExit(Exception):
        pass

    def inner(index):
        if index == n:
            raise EarlyExit
        avail = adj[result[index-1]] if index else nodes
        for i in sorted(avail, key=lambda j: len(adj[j])):
            # Remove vertex i from the graph.  If this isolates
            # more than 1 vertex, no path is possible.
            result[index] = i
            nodes.remove(i)
            nisolated = 0
            for j in adj[i]:
                adj[j].remove(i)
                if not adj[j]:
                    nisolated += 1
                    if nisolated > 1:
                        break
            if nisolated < 2:
                inner(index + 1)
            nodes.add(i)
            for j in adj[i]:
                adj[j].add(i)

    n = len(nodes)
    result = [None] * n
    try:
        inner(0)
    except EarlyExit:
        return result

def solve(n):
    nodes, adj = build_graph(n)
    return ham(nodes, adj)

Dynamic programming, to the rescue:

def is_prime(n):
    return all(n % i != 0 for i in range(2, n))

def order(numbers, current=[]):
    if not numbers:
        return current

    for i, n in enumerate(numbers):
        if current and not is_prime(n + current[-1]):
            continue

        result = order(numbers[:i] + numbers[i + 1:], current + [n])

        if result:
            return result

    return False

result = order(list(range(500)))  # a real list, so slicing and + also work on Python 3

for i in range(len(result) - 1):
    assert is_prime(result[i] + result[i + 1])

You can force it to work for even larger lists by increasing the maximum recursion depth.
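
One way to do that is sketched below for Python 3, using only standard-library calls: sys.setrecursionlimit raises the interpreter's limit, and threading.stack_size gives the worker thread a larger stack so the deeper recursion has room. The helper name run_with_deep_recursion, the toy function depth, and the default values are assumptions chosen for illustration, not tuned settings.

```python
import sys
import threading

def run_with_deep_recursion(func, *args, limit=20000, stack_mb=64):
    """Call func(*args) in a worker thread with a raised recursion limit
    and a larger thread stack, then restore both settings."""
    outcome = {}

    def target():
        outcome["value"] = func(*args)

    old_limit = sys.getrecursionlimit()
    # stack_size() returns the previous setting (0 means the platform default)
    old_stack = threading.stack_size(stack_mb * 1024 * 1024)
    sys.setrecursionlimit(limit)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_stack)
    return outcome.get("value")

def depth(k):
    """Toy recursive function standing in for the real search."""
    return 0 if k == 0 else 1 + depth(k - 1)

print(run_with_deep_recursion(depth, 5000))  # 5000
```

A recursive search such as order() could be passed in place of depth. How far the limit can safely be raised depends on the platform and the available stack.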

Here's my take on a solution. As Tim Peters pointed out, this is a Hamiltonian path problem. So the first step is to generate the graph in some form.

Well, the zeroth step in this case is to generate prime numbers. I'm going to use a sieve, but any primality test is fine. We need primes up to 2 * n since that is the largest any two numbers can sum to.

m = 8
n = m + 1 # Just so I don't have to worry about zero indexes and random +/- 1's
primelen = 2 * m
prime = [True] * primelen
prime[0] = prime[1] = False
for i in range(4, primelen, 2):
    prime[i] = False
for i in range(3, primelen, 2):
    if not prime[i]:
        continue
    for j in range(i * i, primelen, i):
        prime[j] = False

OK, now we can test for primality with prime[i]. Now it's easy to make the graph edges: if I have a number i, which numbers can come next? I'll also make use of the fact that i and j have opposite parity (every sum here is at least 3, and every prime above 2 is odd).

pairs = [set(j for j in range(i%2+1, n, 2) if prime[i+j])
         for i in range(n)]

So here pairs[i] is a set object whose elements are the integers j such that i+j is prime.
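
As a quick check of that structure, the sketch below rebuilds pairs for m = 8 in a self-contained form. The helper name build_pairs and the trial-division is_prime are assumptions made for illustration; the answer itself uses the sieve shown earlier.

```python
def is_prime(k):
    """Trial division; a stand-in for the sieve lookup prime[k]."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def build_pairs(m):
    """pairs[i] holds every j in 1..m of opposite parity to i with i + j prime."""
    n = m + 1
    return [set(j for j in range(i % 2 + 1, n, 2) if is_prime(i + j))
            for i in range(n)]

pairs = build_pairs(8)
print(sorted(pairs[1]))  # [2, 4, 6]
print(sorted(pairs[8]))  # [3, 5]
```

So 1 may be followed by 2, 4 or 6 (sums 3, 5, 7), while 8 may only be followed by 3 or 5 (sums 11, 13).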

Now we need to walk the graph. This is where the time-consuming part is, and all further optimizations will be done here.

chains = [
    ([], set(range(1, n)))
]

chains is going to keep track of the valid paths as we walk them. The first element in each tuple will be your result. The second element is all the unused numbers, or unvisited nodes. The idea is to take one chain out of the queue, take a step down the path and put it back.

while chains:
    chain, unused = chains.pop()

    if not chain:
        # we haven't even started, all unused are valid
        valid_next = unused
    else:
        # We need numbers that are both unused and paired with the last node
        # Using sets makes this easy
        valid_next = unused & pairs[chain[-1]]

    for num in valid_next:
        # Take a step to the new node and add the new path back to chains
        # Reminder, its important not to mutate anything here, always make new objs
        newchain  = chain + [num]
        newunused = unused - set([num])
        chains.append( (newchain, newunused) )

        # are we done?
        if not newunused:
            print(newchain)
            chains = False

Notice that if there is no valid next step, the path is removed without a replacement.
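
Putting the steps above together, a self-contained sketch might look like the following. The function name find_chain is hypothetical, and returning the first complete chain (instead of printing it and clearing chains) is a simplification made for this example.

```python
def find_chain(m):
    """Return one ordering of 1..m whose adjacent sums are prime, or None."""
    if m < 1:
        return None
    n = m + 1

    # Sieve: prime[k] is True exactly when k is prime, for k < 2 * m
    primelen = 2 * m
    prime = [True] * primelen
    prime[0] = prime[1] = False
    for i in range(2, primelen):
        if prime[i]:
            for j in range(i * i, primelen, i):
                prime[j] = False

    # pairs[i]: numbers of opposite parity that may follow i
    pairs = [set(j for j in range(i % 2 + 1, n, 2) if prime[i + j])
             for i in range(n)]

    # Each entry is (path so far, numbers not yet used)
    chains = [([], set(range(1, n)))]
    while chains:
        chain, unused = chains.pop()
        valid_next = unused & pairs[chain[-1]] if chain else unused
        for num in valid_next:
            newchain = chain + [num]
            newunused = unused - {num}
            if not newunused:
                return newchain  # every number has been placed
            chains.append((newchain, newunused))
    return None  # no valid ordering was found

print(find_chain(8))
```

Because chains.pop() takes the most recently added path, this walk behaves like a depth-first search. Which valid ordering it returns depends on set iteration order.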

This is really memory inefficient, but runs in a reasonable time. The biggest performance bottleneck is walking the graph, so the next optimization would be popping and inserting paths in intelligent places to prioritize the most likely paths. It might be helpful to use a collections.deque or a different container for your chains in that case.

EDIT

Here is an example of how you can implement your path priority. We will assign each path a score and keep the chains list sorted by this score. For a simple example I will suggest that paths containing "harder to use" nodes are worth more. That is, for each step on a path the score will increase by n - len(valid_next). The modified code will look something like this.

import bisect
chains = ...
chains_score = [0]
while chains:
     chain, unused = chains.pop()
     score = chains_score.pop()
     ...

     for num in valid_next:
          newchain = chain + [num]
          newunused = unused - set([num])
          newscore = score + n - len(valid_next)
          index = bisect.bisect(chains_score, newscore)
          chains.insert(index, (newchain, newunused))
          chains_score.insert(index, newscore)

Remember that insertion is O(n), so the overhead of adding this can be rather large. It's worth doing some analysis on your score algorithm to keep the queue length len(chains) manageable.
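
If the insertion cost becomes a problem, one possible alternative (not what the code above does) is to swap the sorted list for a binary heap from the standard heapq module, where push and pop are O(log n). The sketch below is self-contained and makes several assumptions for illustration: the function name find_chain_best_first, the trial-division is_prime, and a counter used as a tie-breaker so that heap entries never have to compare paths or sets.

```python
import heapq
import itertools

def is_prime(k):
    """Trial division; a stand-in for the sieve used earlier."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def find_chain_best_first(m):
    """Best-first walk: always expand the path with the highest score."""
    n = m + 1
    pairs = [set(j for j in range(i % 2 + 1, n, 2) if is_prime(i + j))
             for i in range(n)]
    tie = itertools.count()  # unique tie-breaker for equal scores
    # heapq is a min-heap, so scores are stored negated
    heap = [(0, next(tie), [], frozenset(range(1, n)))]
    while heap:
        negscore, _, chain, unused = heapq.heappop(heap)
        valid_next = unused & pairs[chain[-1]] if chain else unused
        for num in valid_next:
            newchain = chain + [num]
            newunused = unused - {num}
            if not newunused:
                return newchain
            # same scoring rule as above: add n - len(valid_next) per step
            newneg = negscore - (n - len(valid_next))
            heapq.heappush(heap, (newneg, next(tie), newchain, newunused))
    return None

print(find_chain_best_first(8))
```

The heap only changes the cost of maintaining the queue. How many paths end up in it still depends on the scoring rule.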
