Optimizing Prime Number Python Code

I'm relatively new to Python, and to coding in general, so I'm not really sure how to go about optimizing my script. The script I have is as follows:

import math
z = 1
x = 0
while z != 0:
    x = x+1
    if x == 500:
        z = 0
    calculated = open('Prime_Numbers.txt', 'r')
    readlines = calculated.readlines()
    calculated.close()
    a = len(readlines)
    b = readlines[(a-1)]

    b = int(b) + 1
    for num in range(b, (b+1000)):
        prime = True
        calculated = open('Prime_Numbers.txt', 'r')
        for i in calculated:
            i = int(i)
            q = math.ceil(num/2)
            if (q%i==0):
                prime = False
        if prime:
            calculated.close()
            writeto = open('Prime_Numbers.txt', 'a')
            num = str(num)
            writeto.write("\n" + num)
            writeto.close()
            print(num)

As some of you can probably guess, I'm calculating prime numbers. The external file that the script reads contains all the prime numbers between 2 and 20. I put the while loop in there because I wanted to be able to control how long it runs.

If you have any suggestions for cutting out the clutter, please let me know. Thanks.

Reading and writing to files is very, very slow compared to operations with integers. Your algorithm can be sped up 100-fold by just ripping out all the file I/O:

import itertools

primes = {2}  # A set containing only 2

for n in itertools.count(3):  # Start counting from 3, by 1
    for prime in primes:      # For every prime less than n
        if n % prime == 0:    # If it divides n
            break             # Then n is composite
    else:
        primes.add(n)         # Otherwise, it is prime
        print(n)

A much faster prime-generating algorithm would be a sieve. Here's the Sieve of Eratosthenes, in Python 3:

end = int(input('Generate primes up to: '))
numbers = {n: True for n in range(2, end)}  # Assume every number is prime, and then

for n, is_prime in numbers.items():         # (Python 3 only)
    if not is_prime:
        continue                            # For every prime number

    for i in range(n ** 2, end, n):         # Cross off its multiples
        numbers[i] = False

    print(n)

It is very inefficient to keep storing and loading all the primes from a file; file access is generally very slow. Instead, keep the primes in a list or deque: initialize calculated = deque() and then add each new prime with calculated.append(num). At the same time, output your primes with print(num) and redirect the output to a file.
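A minimal sketch of that suggestion, assuming a plain divisibility test against the primes found so far. The function name `primes_below`, its `limit` parameter, and the script name in the comment are hypothetical, introduced only for illustration.

```python
from collections import deque


def primes_below(limit):
    """Return all primes below `limit`, keeping them in memory only."""
    calculated = deque()  # takes the place of Prime_Numbers.txt
    for num in range(2, limit):
        # num is prime if none of the primes found so far divides it
        if all(num % p != 0 for p in calculated):
            calculated.append(num)
    return calculated


if __name__ == "__main__":
    for p in primes_below(50):
        # To save the output, redirect it in the shell, for example:
        #   python primes.py > Prime_Numbers.txt   (hypothetical script name)
        print(p)
```

A list would work equally well here, since primes are only ever appended at the end.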

Once you find out that num is not prime, you do not have to keep checking the other divisors, so break out of the inner loop:

if q%i == 0:
    prime = False
    break

You do not need to go through all the previous primes to check a new candidate. Every composite number factors into two integers greater than 1, and at least one of those factors must be less than or equal to sqrt(num). So limit your search to divisors up to sqrt(num).
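The two points above (stop at the first divisor, and stop once the divisor exceeds the square root) could be combined roughly as follows. This is a sketch, not the asker's code: the helper name `is_prime` is hypothetical, and it assumes the list of known primes is sorted in ascending order and already contains every prime up to sqrt(num).

```python
import math


def is_prime(num, primes):
    """Trial division of num by the known primes up to sqrt(num).

    Assumes `primes` is sorted ascending and holds every prime <= sqrt(num).
    """
    limit = math.isqrt(num)  # integer square root, Python 3.8+
    for p in primes:
        if p > limit:
            break  # no divisor up to sqrt(num), so num is prime
        if num % p == 0:
            return False  # found a divisor, so num is composite
    return True


primes = []
for num in range(2, 100):
    if is_prime(num, primes):
        primes.append(num)

print(primes)
```

Because candidates are tested in increasing order, the list always contains every prime below the current candidate, which satisfies the assumption.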

Also, the first part of your code irritates me.

z = 1
x = 0
while z != 0:
    x = x+1
    if x == 500:
        z = 0

This part seems to do the same as:

for x in range(500):

Also, you use x to cap the run at 500. Why not simply use a counter that you increment whenever a prime is found, and break as soon as the limit is reached? In my opinion that would be more readable.

In general, you do not need to introduce a limit at all. You can simply abort the program at any time by pressing Ctrl+C .

However, as others have already pointed out, your chosen algorithm will perform very poorly for medium or large primes. There are more efficient algorithms for finding prime numbers: https://en.wikipedia.org/wiki/Generating_primes , especially https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes .
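One way the counter idea might look, as a sketch only. The names `first_n_primes`, `limit`, and `found` are hypothetical, and the primality test is plain trial division against the primes found so far.

```python
import itertools


def first_n_primes(limit):
    """Return the first `limit` primes, stopping as soon as enough are found."""
    primes = []
    if limit <= 0:
        return primes
    found = 0  # counter, incremented each time a prime is found
    for num in itertools.count(2):
        if all(num % p != 0 for p in primes):
            primes.append(num)
            found += 1
            if found == limit:
                break  # limit reached, stop searching
    return primes


print(first_n_primes(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The counter makes the stopping condition depend on how many primes were found rather than on how many passes the outer loop has made.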

You're writing a blank line to your file, which makes int() raise an exception (hence the traceback). Also, I'm guessing you need to rstrip() your newlines.

I'd suggest using two different files: one for the initial values, and one for all values, both initial and recently computed.

If you can keep your values in memory for a while, that will be a lot faster than going through a file repeatedly. Of course, this limits how many primes you can compute, so for larger values you might return to the iterate-through-the-file method if you want.

For computing primes of modest size, a sieve is actually quite good, and worth a Google search.
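A small sketch of reading the values defensively, assuming one integer per line. The helper name `parse_primes` is hypothetical. Note that int() itself tolerates surrounding whitespace such as a trailing newline, so it is the empty lines that need to be skipped.

```python
def parse_primes(lines):
    """Convert an iterable of text lines to ints, skipping blank lines."""
    # line.strip() is empty (falsy) for blank lines, so int() never sees them
    return [int(line) for line in lines if line.strip()]


# Sample data standing in for the file contents, with a stray blank line
sample = ["2\n", "3\n", "\n", "5\n"]
print(parse_primes(sample))  # [2, 3, 5]
```

An open file object can be passed in directly, since iterating over a file yields its lines. On the writing side, appending each number followed by "\n" (rather than preceded by it) avoids creating the blank line when the file already ends with a newline.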

When you get into larger primes, trial division by the first n primes is a good start, followed by m rounds of Miller-Rabin. If Miller-Rabin indicates that the number is probably prime, you then do complete trial division, AKS, or similar. Miller-Rabin can only say "this is probably a prime" or "this is definitely composite". AKS gives a definitive answer, but it's slower.

FWIW, I've got a bunch of prime-related code collected together at http://stromberg.dnsalias.org/~dstromberg/primes/
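The first two stages of that pipeline could be sketched as below. This is an illustration under assumptions, not the code from the linked collection: the names `is_probable_prime`, `SMALL_PRIMES`, and `rounds` are hypothetical, the choice of ten small primes and 20 rounds is arbitrary, and the final definitive check (complete trial division or AKS) is left out.

```python
import random

# Assumption: the "first n primes" used for the quick trial-division stage
SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)


def is_probable_prime(n, rounds=20):
    """Return False if n is definitely composite, True if it is probably prime."""
    if n < 2:
        return False
    # Stage 1: trial division by a few small primes
    for p in SMALL_PRIMES:
        if n == p:
            return True
        if n % p == 0:
            return False
    # Stage 2: Miller-Rabin. Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue  # this base does not prove n composite
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True  # no witness found: n is probably prime


print(is_probable_prime(2**61 - 1))
print(is_probable_prime(1009 * 1013))
```

A True result here is only probabilistic, which is why the answer suggests a definitive test afterwards when certainty matters.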
