Sieve of Eratosthenes for large numbers c++

Just like this question, I am also working on the sieve of Eratosthenes, again from chapter 4 of the book "Programming: Principles and Practice Using C++". I was able to implement it correctly, and it functions exactly as the exercise asks.

#include <iostream>
#include <vector>

using namespace std;

int main() {
    unsigned int amount = 0;

    cin >> amount;

    vector<int>numbers;

    for (unsigned int i = 0; i <= amount; i++) {
        numbers.push_back(i);
    }

    for (unsigned int p = 2; p <= amount; p++) {
        if (numbers[p] == 0)
            continue;

        cout << p << '\n';

        for (unsigned int i = p + p; i <= amount; i += p) {
            numbers[i] = false;
        }
    }

    return 0;
}

Now, how would I handle really big numbers in the amount input? The unsigned int type should allow me to enter a number up to 2^32 - 1 = 4,294,967,295. But I can't, because I run out of memory. Yes, I've done the math: storing 2^32 ints of 32 bits each takes 32/8 * 2^32 bytes = 16 GiB of memory. I have just 4 GiB...

What I am really doing here is setting non-primes to zero, so I could use a boolean. But a boolean still takes 8 bits, so 1 byte each. Theoretically I could go to the limit for unsigned int (8/8 * 2^32 bytes = 4 GiB), using some of my swap space for the OS and overhead. But I have an x86_64 PC, so what about numbers larger than 2^32?

Knowing that primes are important in cryptography, there must be a more efficient way of doing this. Are there also ways to optimize the time needed to find all those primes?

In terms of storage, you could use the std::vector<bool> container. Because of how it works, you trade some speed for storage: it stores one bit per boolean, so your storage becomes 8 times as efficient. It should be possible to reach numbers close to 8 * 4,294,967,296 if you have all your RAM available for this one program. The only other thing you need to do is use unsigned long long to get access to 64-bit numbers.

Note: Testing the program with the code example below, with an amount input of 8 billion, caused the program to run with a memory usage of approx. 975 MiB, which agrees with the theoretical figure (8 * 10^9 bits is about 954 MiB, plus overhead).
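The memory figures discussed so far can be checked with a few lines of arithmetic. The following is only a sketch: the helper name sieve_bytes and its parameters are made up for illustration, and it counts the flags alone, ignoring any allocator or OS overhead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed to keep one flag for every number
// in 0..limit, given how many bits each flag occupies
// (32 for an int, 8 for a plain bool, 1 for a bit in vector<bool>).
std::uint64_t sieve_bytes(std::uint64_t limit, unsigned bits_per_flag) {
    assert(bits_per_flag > 0);
    std::uint64_t count = limit + 1;              // flags for 0..limit
    return (count * bits_per_flag + 7) / 8;       // round up to whole bytes
}
```

Under these assumptions, sieve_bytes(4294967295, 32) gives 16 GiB, sieve_bytes(4294967295, 8) gives 4 GiB, and sieve_bytes(4294967295, 1) gives 512 MiB. For 8 billion one-bit flags the result is 1,000,000,000 bytes, roughly 954 MiB.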

You can also gain some time, because you can create the complete vector at once, without a loop: vector<bool> numbers(amount + 1, true); creates a vector with one entry for each number from 0 to amount, with all elements set to true. Now you can adjust the code to set non-primes to false instead of 0.

Furthermore, once you have run the sieve up to the square root of amount, all numbers that remain true are primes. Insert if (p * p > amount) as an additional continue condition, just after you output the prime number. This is another modest improvement to your processing time.

Edit: The inner loop can start at p * p, because every smaller multiple of p has a smaller prime factor and has therefore already been crossed out by an earlier prime.

All together you should get something like this:

#include <iostream>
#include <vector>

using namespace std;

int main() {
    unsigned long long amount = 0;

    cin >> amount;

    // One bit per number; indices 0..amount, hence amount + 1 entries.
    vector<bool> numbers(amount + 1, true);

    for (unsigned long long p = 2; p <= amount; p++) {
        if (!numbers[p])
            continue;

        cout << p << '\n';

        // Nothing left to cross out once p * p > amount.
        // Written as a division so that p * p cannot overflow.
        if (p > amount / p)
            continue;

        for (unsigned long long i = p * p; i <= amount; i += p) {
            numbers[i] = false;
        }
    }

    return 0;
}

You've asked a couple of different questions.

For primes up to 2^32, sieving is appropriate, but you need to work in segments instead of in one big block. My answer here explains how to do that.
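As a rough illustration of working in segments (not the code from the linked answer, which is not reproduced here), a segmented sieve first finds the primes up to the square root of the limit, then reuses one small buffer for each block of numbers. The function name segmented_primes and the segment_size parameter are made up for this sketch, and it assumes the limit is comfortably below 2^64.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch of a segmented sieve: returns all primes <= limit.
// segment_size (numbers per segment) is an illustrative tuning parameter.
std::vector<std::uint64_t> segmented_primes(std::uint64_t limit,
                                            std::uint64_t segment_size = 1 << 16) {
    assert(segment_size > 0);
    std::vector<std::uint64_t> primes;
    if (limit < 2) return primes;

    // Step 1: plain sieve for the base primes up to floor(sqrt(limit)).
    std::uint64_t root = 1;
    while (root + 1 <= limit / (root + 1)) ++root;
    std::vector<bool> small(root + 1, true);
    std::vector<std::uint64_t> base;
    for (std::uint64_t p = 2; p <= root; ++p) {
        if (!small[p]) continue;
        base.push_back(p);
        for (std::uint64_t m = p * p; m <= root; m += p) small[m] = false;
    }

    // Step 2: sieve [low, high] one segment at a time with a reused buffer.
    std::vector<bool> segment;
    for (std::uint64_t low = 2; low <= limit;) {
        std::uint64_t high =
            (limit - low < segment_size - 1) ? limit : low + segment_size - 1;
        segment.assign(high - low + 1, true);
        for (std::uint64_t p : base) {
            if (p > high / p) break;  // p * p > high: nothing to cross out
            // First multiple of p inside the segment, but never below p * p.
            std::uint64_t start = std::max(p * p, (low + p - 1) / p * p);
            for (std::uint64_t m = start; m <= high; m += p)
                segment[m - low] = false;
        }
        for (std::uint64_t n = low; n <= high; ++n)
            if (segment[n - low]) primes.push_back(n);
        if (high == limit) break;
        low = high + 1;
    }
    return primes;
}
```

Calling segmented_primes(100, 10) walks through ten-number segments and collects the 25 primes up to 97. Collecting the results in a vector is only for demonstration: for limits near 2^32 you would print or count the primes of each segment instead, so that memory stays at roughly the base primes plus one segment.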

For cryptographic primes, which are very much larger, the process is to pick a number and then test it for primality, using a probabilistic test such as a Miller-Rabin test or a Baillie-Wagstaff test. This process isn't perfect, and occasionally a composite might be chosen instead of a prime, but such an occurrence is very rare.
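To give an idea of what such a test looks like, here is a minimal Miller-Rabin sketch. It is deliberately limited to 32-bit inputs so that it stays within standard C++ integer types; real cryptographic primes have hundreds of digits and would need a big-integer library and randomly chosen bases. The names pow_mod and is_probable_prime are made up for this example. For inputs below 4,759,123,141 the fixed bases 2, 7 and 61 are known to give an exact answer, so in this small range the test is not actually probabilistic.

```cpp
#include <cassert>
#include <cstdint>

// (base^exp) mod m by repeated squaring; requires m < 2^32 so that
// every intermediate product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    assert(m > 0);
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// Miller-Rabin test with the fixed bases 2, 7 and 61 (illustrative name).
bool is_probable_prime(std::uint32_t n) {
    if (n < 2) return false;
    const std::uint32_t bases[] = {2, 7, 61};
    for (std::uint32_t a : bases)
        if (n == a) return true;
    if (n % 2 == 0) return false;

    // Write n - 1 as d * 2^r with d odd.
    std::uint32_t d = n - 1;
    int r = 0;
    while (d % 2 == 0) { d /= 2; ++r; }

    for (std::uint32_t a : bases) {
        std::uint64_t x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1) continue;   // this base finds no evidence
        bool composite = true;
        for (int i = 1; i < r; ++i) {
            x = x * x % n;
            if (x == n - 1) { composite = false; break; }
        }
        if (composite) return false;          // base a proves n composite
    }
    return true;
}
```

For example, is_probable_prime(97) is true, while is_probable_prime(561) is false even though 561 is a Carmichael number that fools the simpler Fermat test.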
