Code is taking too much time

Question

I wrote code to arrange the numbers 1 to n, where n is taken from user input. The ordering requires that the sum of every pair of adjacent numbers is prime. The code works fine for inputs up to 10. Beyond that, the system hangs. Please let me know how to optimize it.

Example: for input 8, a valid answer is (1, 2, 3, 4, 7, 6, 5, 8).
The code is as follows:

import itertools

x = raw_input("please enter a number")
range_x = range(int(x)+1)
del range_x[0]
result = list(itertools.permutations(range_x))
def prime(x):
    for i in xrange(1,x,2):
        if i == 1:
            i = i+1
        if x%i==0 and i < x :
            return False
    else:
        return True

def is_prime(a):
    for i in xrange(len(a)):
        print a
        if i < len(a)-1:
            if prime(a[i]+a[i+1]):
                pass
            else:
                return False
        else:
            return True


for i in xrange(len(result)):
    if i < len(result)-1:
        if is_prime(result[i]):
            print 'result is:'
            print result[i]
            break
    else:
        print 'result is'
        print result[i-1]

Answer 1

This answer is based on @Tim Peters' suggestion about Hamiltonian paths.

There are many possible solutions. To avoid excessive memory consumption from storing intermediate solutions, random paths are generated instead. This also makes it easy to use multiple CPUs: each CPU generates its own paths in parallel.

import multiprocessing as mp
import sys

def main():
    number = int(sys.argv[1])

    # graph with vertices 1..number (inclusive)
    # there is an edge between i and j if (i+j) is prime; since i+j == j+i,
    # every edge is stored in both directions
    vertices = range(1, number+1)
    G = {} # vertex -> adjacent vertices
    is_prime = sieve_of_eratosthenes(2*number+1)
    for i in vertices:
        G[i] = []
        for j in vertices:
            if is_prime[i + j]:
                G[i].append(j) # there is an edge from i to j in the graph

    # utilize multiple cpus
    q = mp.Queue()
    for _ in range(mp.cpu_count()):
        p = mp.Process(target=hamiltonian_random, args=[G, q])
        p.daemon = True # do not survive the main process
        p.start()
    print(q.get())

if __name__=="__main__":
    main()

where Sieve of Eratosthenes is:

def sieve_of_eratosthenes(limit):
    is_prime = [True]*limit
    is_prime[0] = is_prime[1] = False # zero and one are not primes
    # n must reach floor(sqrt(limit)); otherwise squares such as 9 or 25 stay marked prime
    for n in range(2, int(limit**.5) + 1):
        if is_prime[n]:
            for composite in range(n*n, limit, n):
                is_prime[composite] = False
    return is_prime

and:

import random

def hamiltonian_random(graph, result_queue):
    """Build random paths until Hamiltonian path is found."""
    vertices = list(graph.keys())
    while True:
        # build random path
        path = [random.choice(vertices)] # start with a random vertex
        while True: # while the path can be extended with a random adjacent vertex
            neighbours = graph[path[-1]]
            random.shuffle(neighbours)
            for adjacent_vertex in neighbours:
                if adjacent_vertex not in path:
                    path.append(adjacent_vertex)
                    break
            else: # can't extend path
                break

        # check whether it is hamiltonian
        if len(path) == len(vertices):
            assert set(path) == set(vertices)
            result_queue.put(path) # found hamiltonian path
            return

Example

$ python order-adjacent-prime-sum.py 20

Output

[19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]

The output is a random sequence that satisfies the conditions:

it is a permutation of the range from 1 to 20 (inclusive)
the sum of adjacent numbers is prime
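
Both conditions are easy to check mechanically. The following is a minimal sketch and not part of the original script; the helper names is_prime and is_valid_ordering are made up for illustration, and primality is tested by plain trial division.

```python
def is_prime(k):
    """Trial division; adequate for the small sums involved here."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def is_valid_ordering(seq, n):
    """True if seq is a permutation of 1..n whose adjacent sums are all prime."""
    if sorted(seq) != list(range(1, n + 1)):
        return False
    return all(is_prime(a + b) for a, b in zip(seq, seq[1:]))


sample = [19, 18, 13, 10, 1, 4, 9, 14, 5, 6, 17, 2, 15, 16, 7, 12, 11, 8, 3, 20]
print(is_valid_ordering(sample, 20))                   # True
print(is_valid_ordering([1, 2, 3, 4, 7, 6, 5, 8], 8))  # True
print(is_valid_ordering([1, 2, 3, 4, 5, 6, 7, 8], 8))  # False, because 4 + 5 = 9
```

The same check can be applied to the output of any of the approaches in this thread.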

Time performance

It takes around 10 seconds on average to get a result for n = 900. Extrapolating the time as an exponential function of n, it should take around 20 seconds for n = 1000:

[Figure: time performance (no fixed solution)]

The image is generated using this code:

import numpy as np
figname = 'hamiltonian_random_noset-noseq-900-900'
Ns, Ts = np.loadtxt(figname+'.xy', unpack=True)

# use polyfit to fit the data
# y = c*a**n
# log y = log (c * a ** n)
# log Ts = log c + Ns * log a
coeffs = np.polyfit(Ns, np.log2(Ts), deg=1)
poly = np.poly1d(coeffs, variable='Ns')

# use curve_fit to fit the data
from scipy.optimize import curve_fit
def func(x, a, c):
    return c*a**x
popt, pcov = curve_fit(func, Ns, Ts)
aa, cc = popt
a, c = 2**coeffs

# plot it
import matplotlib.pyplot as plt
plt.figure()
plt.plot(Ns, np.log2(Ts), 'ko', label='time measurements')
plt.plot(Ns, np.polyval(poly, Ns), 'r-',
         label=r'$time = %.2g\times %.4g^N$' % (c, a))
plt.plot(Ns, np.log2(func(Ns, *popt)), 'b-',
         label=r'$time = %.2g\times %.4g^N$' % (cc, aa))
plt.xlabel('N')
plt.ylabel('log2(time in seconds)')
plt.legend(loc='upper left')
plt.show()

Fitted values:

>>> c*a**np.array([900, 1000])
array([ 11.37200806,  21.56029156])
>>> func([900, 1000], *popt)
array([ 14.1521409 ,  22.62916398])
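
The comments in the plotting code describe the fit: taking logarithms turns time = c * a**N into a straight line, log2(time) = log2(c) + N * log2(a), which ordinary least squares can fit. A dependency-free sketch of that idea follows. The data here are synthetic, generated from made-up constants rather than the measurements behind the plot, and fit_exponential is a hypothetical helper name.

```python
import math


def fit_exponential(ns, ts):
    """Least-squares fit of log2(t) = log2(c) + n * log2(a); returns (c, a)."""
    ys = [math.log2(t) for t in ts]
    count = len(ns)
    mean_n = sum(ns) / count
    mean_y = sum(ys) / count
    slope = (sum((n - mean_n) * (y - mean_y) for n, y in zip(ns, ys))
             / sum((n - mean_n) ** 2 for n in ns))
    intercept = mean_y - slope * mean_n
    return 2 ** intercept, 2 ** slope


# Synthetic, noise-free data from made-up constants (NOT the real timings).
true_c, true_a = 0.05, 1.006
ns = list(range(100, 1000, 100))
ts = [true_c * true_a ** n for n in ns]

c, a = fit_exponential(ns, ts)
print(round(c, 6), round(a, 6))  # recovers the constants used above
print(c * a ** 1000)             # extrapolated value at N = 1000
```

With real, noisy timings the two fitting methods (a line through the logarithms versus a direct curve fit) weight the points differently, which is why the two sets of fitted values above do not agree exactly.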

Answer 2

For posterity ;-), here's one more based on finding a Hamiltonian path. It's Python3 code. As written, it stops upon finding the first path, but can easily be changed to generate all paths. On my box, it finds a solution for all n in 1 through 900 inclusive in about one minute total. For n somewhat larger than 900, it exceeds the maximum recursion depth.

The prime generator (psieve()) is vast overkill for this particular problem, but I had it handy and didn't feel like writing another ;-)

The path finder (ham()) is a recursive backtracking search, using what's often (but not always) a very effective ordering heuristic: of all the vertices adjacent to the last vertex in the path so far, look first at those with the fewest remaining exits. For example, this is "the usual" heuristic applied to solving Knight's Tour problems. In that context, it often finds a tour with no backtracking needed at all. Your problem appears to be a little tougher than that.

def psieve():
    import itertools
    yield from (2, 3, 5, 7)
    D = {}
    ps = psieve()
    next(ps)
    p = next(ps)
    assert p == 3
    psq = p*p
    for i in itertools.count(9, 2):
        if i in D:      # composite
            step = D.pop(i)
        elif i < psq:   # prime
            yield i
            continue
        else:           # composite, = p*p
            assert i == psq
            step = 2*p
            p = next(ps)
            psq = p*p
        i += step
        while i in D:
            i += step
        D[i] = step

def build_graph(n):
    primes = set()
    for p in psieve():
        if p > 2*n:
            break
        else:
            primes.add(p)

    np1 = n+1
    adj = [set() for i in range(np1)]
    for i in range(1, np1):
        for j in range(i+1, np1):
            if i+j in primes:
                adj[i].add(j)
                adj[j].add(i)
    return set(range(1, np1)), adj

def ham(nodes, adj):
    class EarlyExit(Exception):
        pass

    def inner(index):
        if index == n:
            raise EarlyExit
        avail = adj[result[index-1]] if index else nodes
        for i in sorted(avail, key=lambda j: len(adj[j])):
            # Remove vertex i from the graph.  If this isolates
            # more than 1 vertex, no path is possible.
            result[index] = i
            nodes.remove(i)
            nisolated = 0
            for j in adj[i]:
                adj[j].remove(i)
                if not adj[j]:
                    nisolated += 1
                    if nisolated > 1:
                        break
            if nisolated < 2:
                inner(index + 1)
            nodes.add(i)
            for j in adj[i]:
                adj[j].add(i)

    n = len(nodes)
    result = [None] * n
    try:
        inner(0)
    except EarlyExit:
        return result

def solve(n):
    nodes, adj = build_graph(n)
    return ham(nodes, adj)
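
The recursion-depth limit mentioned above can be avoided by keeping the search stack explicitly. The following is a minimal sketch under that assumption, not the answer's code: prime_sum_path and exits are illustrative names, the primality test is plain trial division, and the isolated-vertex pruning done in ham() is left out for brevity. Candidates are still tried in order of fewest remaining exits.

```python
def is_prime(k):
    """Trial division; adequate for sums up to 2*n."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def prime_sum_path(n):
    """Return one ordering of 1..n with prime adjacent sums, or None."""
    vertices = range(1, n + 1)
    adj = {i: [j for j in vertices if j != i and is_prime(i + j)]
           for i in vertices}
    used = set()
    path = []

    def exits(v):
        # Number of unused neighbours still reachable from v.
        return sum(1 for w in adj[v] if w not in used)

    # One iterator of candidate vertices per search level replaces the call stack.
    stack = [iter(sorted(vertices, key=exits))]
    while stack:
        v = next(stack[-1], None)
        if v is None:
            # Candidates exhausted at this level: undo the previous choice.
            stack.pop()
            if path:
                used.discard(path.pop())
            continue
        path.append(v)
        used.add(v)
        if len(path) == n:
            return path
        candidates = [w for w in adj[v] if w not in used]
        stack.append(iter(sorted(candidates, key=exits)))
    return None


print(prime_sum_path(8))
```

Because the stack is an ordinary list, its depth is bounded by memory rather than by the interpreter's recursion limit. How the running time compares with ham() for large n has not been measured here.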

Answer 3

Recursive backtracking to the rescue:

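def is_prime(n):
    # 0 and 1 are not prime; trial division is enough for these small sums
    return n > 1 and all(n % i != 0 for i in range(2, n))

def order(numbers, current=[]):
    # current is never mutated, so the mutable default is safe here
    if not numbers:
        return current

    for i, n in enumerate(numbers):
        # skip n if it cannot follow the last number placed
        if current and not is_prime(n + current[-1]):
            continue

        # place n and recurse on the remaining numbers
        result = order(numbers[:i] + numbers[i + 1:], current + [n])

        if result:
            return result

    return False

# a list (not a range object) so that slicing and + also work on Python 3;
# the numbers start at 1, as in the question
result = order(list(range(1, 500)))

for i in range(len(result) - 1):
    assert is_prime(result[i] + result[i + 1])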

You can force it to work for even larger lists by increasing the maximum recursion depth.
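
A minimal sketch of raising the limit with the standard sys module follows. The depth function is a made-up stand-in for a deep recursive search, and the value 5000 is an arbitrary choice for illustration.

```python
import sys


def depth(k):
    """Recurse k levels deep and return k; a stand-in for a deep search."""
    return 0 if k == 0 else 1 + depth(k - 1)


# The default limit is usually 1000; raise it before recursing deeper.
sys.setrecursionlimit(max(sys.getrecursionlimit(), 5000))
print(depth(2000))
```

Setting the limit far higher than the platform's C stack can support may crash the interpreter instead of raising an exception, so it is worth increasing it only as far as needed.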

Answer 4

Here's my take on a solution. As Tim Peters pointed out, this is a Hamiltonian path problem. So the first step is to generate the graph in some form.

Well, the zeroth step in this case is to generate prime numbers. I'm going to use a sieve, but any primality test is fine. We need primes up to 2 * n since that is the largest any two numbers can sum to.

m = 8
n = m + 1 # Just so I don't have to worry about zero indexes and random +/- 1's
primelen = 2 * m
prime = [True] * primelen
prime[0] = prime[1] = False
for i in range(4, primelen, 2):
    prime[i] = False
for i in range(3, primelen, 2):
    if not prime[i]:
        continue
    for j in range(i * i, primelen, i):
        prime[j] = False

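OK, now we can test for primality with prime[i]. Now it's easy to make the graph edges: if I have a number i, what numbers can come next? I'll also make use of the fact that i and j must have opposite parity, because two distinct numbers of the same parity sum to an even number greater than 2, which cannot be prime.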

pairs = [set(j for j in range(i%2+1, n, 2) if prime[i+j])
         for i in range(n)]

So here pairs[i] is a set object whose elements are the integers j such that i+j is prime.
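
To make the structure concrete, here is a small self-contained sketch that builds the same kind of pairs list for m = 8 and prints two entries. It is an illustration only: it uses a trial-division helper (is_prime, a name introduced here) in place of the sieve.

```python
m = 8
n = m + 1  # index 0 is unused, as in the text


def is_prime(k):
    """Trial division; stands in for the sieve lookup prime[k]."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


# Only numbers j of opposite parity to i are considered.
pairs = [set(j for j in range(i % 2 + 1, n, 2) if is_prime(i + j))
         for i in range(n)]

print(sorted(pairs[1]))  # [2, 4, 6]   (1 + 8 = 9 is not prime)
print(sorted(pairs[8]))  # [3, 5]      (8 + 1 = 9 and 8 + 7 = 15 are not prime)
```

The relation is symmetric: whenever j is in pairs[i], i is in pairs[j].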

Now we need to walk the graph. This is really where the time consuming part is and all further optimizations will be done here.

chains = [
    ([], set(range(1, n))),
]

chains is going to keep track of the valid paths as we walk them. The first element in the tuple will be your result. The second element is all the unused numbers, or unvisited nodes. The idea is to take one chain out of the queue, take a step down the path and put it back.

while chains:
    chain, unused = chains.pop()

    if not chain:
        # we haven't even started, all unused are valid
        valid_next = unused
    else:
        # We need numbers that are both unused and paired with the last node
        # of this chain. Using sets makes this easy
        valid_next = unused & pairs[chain[-1]]

    for num in valid_next:
        # Take a step to the new node and add the new path back to chains
        # Reminder, it's important not to mutate anything here, always make new objs
        newchain  = chain + [num]
        newunused = unused - set([num])
        chains.append( (newchain, newunused) )

        # are we done?
        if not newunused:
            print(newchain)
            chains = []  # empty the queue so the outer loop ends
            break

Notice that if there is no valid next step, the path is removed without a replacement.

This is really memory inefficient, but runs in a reasonable time. The biggest performance bottleneck is walking the graph, so the next optimization would be popping and inserting paths in intelligent places to prioritize the most likely paths. It might be helpful to use a collections.deque or different container for your chains in that case.

EDIT

Here is an example of how you can implement your path priority. We will assign each path a score and keep the chains list sorted by this score. For a simple example I will suggest that paths containing "harder to use" nodes are worth more. That is, for each step on a path the score will increase by n - len(valid_next). The modified code will look something like this.

import bisect
chains = ...
chains_score = [0]
while chains:
     chain, unused = chains.pop()
     score = chains_score.pop()
     ...

     for num in valid_next:
          newchain = chain + [num]
          newunused = unused - set([num])
          newscore = score + n - len(valid_next)
          index = bisect.bisect(chains_score, newscore)
          chains.insert(index, (newchain, newunused))
          chains_score.insert(index, newscore)

Remember that insertion into a list is O(n), so the overhead of adding this can be rather large. It's worth doing some analysis on your score algorithm to keep the queue length len(chains) manageable.
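
One way to avoid the O(n) list insertions is a binary heap, where push and pop are O(log n). The sketch below swaps the sorted list and bisect for the standard-library heapq module; this is a substitution for illustration, not the code above. best_first_order is a hypothetical name, primality is checked by trial division, and scores are negated because heapq pops the smallest entry first.

```python
import heapq


def is_prime(k):
    """Trial division; adequate for sums up to 2*m."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def best_first_order(m):
    """Return an ordering of 1..m with prime adjacent sums, or None."""
    numbers = range(1, m + 1)
    pairs = {i: frozenset(j for j in numbers if j != i and is_prime(i + j))
             for i in numbers}
    # Heap entries: (-score, tie_breaker, chain, unused).
    # The unique tie_breaker keeps Python from ever comparing chains or sets.
    heap = [(0, 0, (), frozenset(numbers))]
    counter = 1
    while heap:
        neg_score, _, chain, unused = heapq.heappop(heap)
        if not unused:
            return list(chain)
        valid_next = unused & pairs[chain[-1]] if chain else unused
        for num in valid_next:
            # Same scoring rule as in the text, with n = m + 1.
            new_score = -neg_score + (m + 1) - len(valid_next)
            heapq.heappush(heap, (-new_score, counter,
                                  chain + (num,), unused - {num}))
            counter += 1
    return None


print(best_first_order(8))
```

The memory concern from the text still applies, since every partial chain stays in the heap until it is popped.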

solution1 4 2013-12-26 09:07:42
solution2 4 2013-12-26 19:05:12
solution3 3 ACCEPTED 2013-12-26 06:00:51
solution4 3 2013-12-26 08:09:43
