Python - How to decrease big O and increase efficiency in multiple nested for loops?

I wrote a Python script that finds every combination satisfying the following conditions:

  1. a^2 + b^2 + c^2 + d^2 + e^2 = f^2
  2. a, b, c, d, e, f are distinct, nonzero integers
  3. a, b, c, d, e are even numbers that lie between twin primes (e.g., 11 and 13 are twin primes, so 12 is a valid value)
  4. f ≤ 65535
  5. the digit sums of a, b, c, d, e, and f are all equal

I'm not sure whether there will be any results once criterion 5 is included, but I'd at least like to find out in a reasonable amount of time. Ideally, the following conditions should also be met:

  1. results that use the same values for a, b, c, d, e, f in a different order should not appear in the output; ideally they should be excluded from the for loops as well
  2. results should be sorted by lowest value of a first, then lowest value of b, and so forth

My question is: how can I decrease the running time and increase efficiency?

import itertools
import time

start_time = time.time()

def is_prime(n):
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

def generate_twin_primes(start, end):
    for i in range(start, end):
        j = i + 2
        if(is_prime(i) and is_prime(j)):
            n = text_file2.write(str(i+1) + '\n')

def sum_digits(n):
   r = 0
   while n:
       r, n = r + n % 10, n // 10
   return r

def is_sorted(vals):
    for i in range(len(vals)-2):
        if vals[i] < vals[i+1]:
            return False
    return True

def pythagorean_sixlet():
    valid = []
    for a in x:
        for b in x:
            for c in x:
                for d in x:
                    for e in x:
                        f = (a * a + b * b + c * c + d * d + e * e)**(1/2)
                        if f % 1 == 0 and all(x[0]!=x[1] for x in list(itertools.combinations([a, b, c, d, e], 2))):
                            if sum_digits(a) == sum_digits(b) == sum_digits(c) == sum_digits(d) == sum_digits(e) == sum_digits(int(f)):
                                valid.append([a, b, c, d, e, int(f)])
    for valid_entry in valid:
        if is_sorted(valid_entry):
            with open("output.txt", "a") as text_file1:
                text_file1.write(str(valid_entry[0]) + " " + str(valid_entry[1]) + " " + str(valid_entry[2]) + " " + str(valid_entry[3]) + " " + str(valid_entry[4]) + " | " + str(valid_entry[5]) + '\n')
                text_file1.close()

#input #currently all even numbers between twin primes under 1000
text_file2 = open("input.txt", "w")
generate_twin_primes(2, 1000)
text_file2.close()

# counting number of lines in "input.txt" and calculating number of potential possibilities to go through
count = 0
fname = "input.txt"
with open(fname, 'r') as f:
    for line in f:
        count += 1
print("Number of lines:", count)
print("Number of potential possibilities:", count**5)

with open('input.txt', 'r') as f:
    x = f.read().splitlines()
    x = [int(px) for px in x]

pythagorean_sixlet()
print("--- %s seconds ---" % (time.time() - start_time))

Well, this smells a lot like a homework problem, so we can't give away the farm too easily... :)

A couple of things to consider:

  1. If you only check unique combinations, the number of possibilities drops a good chunk from count**5, right?
  2. You are doing all of your checking at the innermost part of the loop. Can you do some checking along the way, so that you don't have to generate and test all of the possibilities, which is "expensive"?
  3. If you do choose to keep your check for uniqueness in the inner portion, find a better way than making all the combinations... that is wayyyyy expensive. Hint: if you made a set of the numbers you have, what would it tell you?
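To make hints 1 and 3 concrete, here is a minimal sketch. The helper name `all_distinct`, the candidate count `n = 35`, and the six-value `sample` list are illustrative assumptions, not values taken from the question's run.

```python
import itertools
import math

def all_distinct(values):
    """A set drops duplicates, so the sizes match only when every value differs."""
    return len(set(values)) == len(values)

# Hint 1: ordered 5-tuples versus unordered 5-element combinations.
n = 35                       # assumed candidate count, for illustration only
ordered = n ** 5             # what five nested "for ... in x" loops visit
unordered = math.comb(n, 5)  # what itertools.combinations(x, 5) visits
print(ordered, unordered)

# itertools.combinations yields each group once, in input order, with no
# repeated elements, so no separate uniqueness or sortedness check is needed.
sample = [4, 6, 12, 18, 30, 42]  # small ascending list of candidates
combos = list(itertools.combinations(sample, 5))
print(len(combos), combos[0])
```

If the input list is ascending, every tuple coming out of `itertools.combinations` is ascending too, which covers the "same values in a different order" requirement without an `is_sorted` pass afterwards. `math.comb` needs Python 3.8 or later.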

Implementing some of the above:

Number of candidate twin primes between [2, 64152]: 846

total candidate solutions: 1795713740 [need to check f for these]
elapsed:  5.957056045532227

size of result set: 27546
20 random selections from the group:
(40086.0, [3852, 4482, 13680, 20808, 30852])
(45774.0, [6552, 10458, 17028, 23832, 32940])
(56430.0, [1278, 13932, 16452, 27108, 44532])
(64746.0, [15732, 17208, 20772, 32562, 46440])
(47610.0, [3852, 9432, 22158, 24372, 32832])
(53046.0, [3852, 17208, 20772, 23058, 39240])
(36054.0, [4518, 4932, 16452, 21492, 22860])
(18396.0, [3258, 4518, 5742, 9342, 13680])
(45000.0, [2970, 10890, 16650, 18540, 35730])
(59976.0, [2970, 9342, 20772, 35802, 42282])
(42246.0, [3528, 5652, 17208, 25308, 28350])
(39870.0, [3528, 7308, 16362, 23292, 26712])
(64656.0, [8820, 13932, 16452, 36108, 48312])
(61200.0, [198, 882, 22158, 35532, 44622])
(55350.0, [3168, 3672, 5652, 15732, 52542])
(14526.0, [1278, 3528, 7128, 7560, 9432])
(65106.0, [5652, 30852, 31248, 32832, 34650])
(63612.0, [2088, 16830, 26730, 33750, 43650])
(42066.0, [2088, 13932, 15642, 23832, 27540])
(31950.0, [828, 3582, 13932, 16452, 23292])
--- 2872.701852083206 seconds ---
[Finished in 2872.9s]
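For reference, one way the "check along the way" idea could look is sketched below. This is not the code that produced the output above; it is a sketch under my own assumptions. The function names (`twin_prime_middles`, `digit_sum`, `find_sixlets`) are hypothetical, the candidates are grouped by digit sum up front, values are picked in ascending order, and a branch is abandoned as soon as the smallest possible sum of squares already exceeds the limit on f. It also uses `math.isqrt` for an exact perfect-square test instead of a float square root.

```python
import math
from collections import defaultdict

def twin_prime_middles(limit):
    """Even n < limit with n - 1 and n + 1 both prime, via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 2)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit + 1) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 2, i)))
    return [n for n in range(4, limit, 2) if sieve[n - 1] and sieve[n + 1]]

def digit_sum(n):
    return sum(int(ch) for ch in str(n))

def find_sixlets(limit_f, candidates):
    """Yield (a, b, c, d, e, f) with a < b < c < d < e and f <= limit_f."""
    max_sq = limit_f * limit_f
    groups = defaultdict(list)
    for n in sorted(candidates):
        groups[digit_sum(n)].append(n)  # a..e must share one digit sum

    def extend(group, target, start, chosen, total):
        if len(chosen) == 5:
            f = math.isqrt(total)  # exact integer square root
            if f * f == total and digit_sum(f) == target:
                yield (*chosen, f)
            return
        for i in range(start, len(group)):
            n = group[i]
            # Every value still to be picked is >= n, so this is a lower
            # bound on the final sum of squares. The group is ascending,
            # so once the bound is exceeded no later n can work either.
            if total + (5 - len(chosen)) * n * n > max_sq:
                break
            yield from extend(group, target, i + 1, chosen + [n], total + n * n)

    for target in sorted(groups):
        yield from extend(groups[target], target, 0, [], 0)

# Small check using one row from the sample output above.
example = sorted(find_sixlets(65535, [3852, 4482, 13680, 20808, 30852]))
print(example)
```

A full run would be something like `sorted(find_sixlets(65535, twin_prime_middles(65535)))`; the final `sorted` gives the "lowest a first, then lowest b" ordering. Expect the full search to still take a while in pure Python, since the pruning only cuts branches and does not change the overall shape of the search.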
