Sieve of Eratosthenes for large numbers in C++

Question

Just like this question, I am also working on the sieve of Eratosthenes, from chapter 4 of the book "Programming: Principles and Practice Using C++". I was able to implement it correctly, and it functions exactly as the exercise asks.

#include <iostream>
#include <vector>

using namespace std;

int main() {
    unsigned int amount = 0;

    cin >> amount;

    vector<int>numbers;

    for (unsigned int i = 0; i <= amount; i++) {
        numbers.push_back(i);
    }

    for (unsigned int p = 2; p < amount; p++) {
        if (numbers[p] == 0)
            continue;

        cout << p << '\n';

        for (unsigned int i = p + p; i <= amount; i += p) {
            numbers[i] = false;
        }
    }

    return 0;
}

Now, how can I handle really big numbers in the amount input? The unsigned int type should allow me to enter numbers up to 2^32 - 1 = 4,294,967,295. But I can't; I run out of memory. Yes, I've done the math: storing 2^32 values of type int, 32 bits each, takes 32/8*2^32 = 16 GiB of memory. I have just 4 GiB...

So what I am really doing here is setting non-primes to zero, so I could use a boolean instead. But a bool would still take 8 bits, so 1 byte each. Theoretically, I could go up to the limit of unsigned int (8/8*2^32 = 4 GiB) by using some of my swap space for the OS and overhead. But I have an x86_64 PC, so what about numbers larger than 2^32?

Knowing that primes are important in cryptography, there must be a more efficient way of doing this? And are there also ways to reduce the time needed to find all those primes?

Answer 1

In terms of storage, you could use the std::vector<bool> container. Because of how it works, you trade some speed for storage. It typically packs one bit per boolean, so your storage becomes 8 times as efficient. You should be able to reach numbers close to 8*4,294,967,296 if all your RAM is available to this one program. The only thing you need to do is use unsigned long long to get access to 64-bit numbers.

Note: Testing the program with the code example below, with an amount input of 8 billion, caused the program to run with a memory usage of approx. 975 MiB, confirming the theoretical number.
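The memory arithmetic from the question and from the note above can be checked with a few lines. This is only a sketch: sieve_bytes, kMiB and kGiB are names made up for the illustration, and it assumes a 32-bit int and a 1-byte bool, as the question does.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;
constexpr std::uint64_t kGiB = 1024ULL * kMiB;

// Bytes needed to hold `count` flags at `bits_per_flag` bits each, rounded up.
// Hypothetical helper for illustration; it ignores container and process overhead.
constexpr std::uint64_t sieve_bytes(std::uint64_t count, std::uint64_t bits_per_flag) {
    return (count * bits_per_flag + 7) / 8;
}
```

With these definitions, sieve_bytes(1ULL << 32, 32) is 16 GiB, sieve_bytes(1ULL << 32, 8) is 4 GiB, and sieve_bytes(8000000000ULL, 1) is 1,000,000,000 bytes, or roughly 954 MiB. The measured 975 MiB is a little higher, presumably because of the rest of the process.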

You can also gain some time, because you can declare the complete vector at once, without iteration: vector<bool>numbers (amount, true); creates a vector of size equal to the input amount, with all elements set to true. Now you can adjust the code to set non-primes to false instead of 0.

Furthermore, once you have followed the sieve up to the square root of amount, all numbers that remain true are primes. Insert if (p * p >= amount) as an additional continue condition, just after you output the prime number. This too is a modest improvement to your processing time.

Edit: In the last loop, i can start at the square of p instead of p + p, because every composite number below the square of p has already been crossed off as a multiple of a smaller prime.
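To see that starting at the square of p does not change the result, the sieve can be wrapped in a function with a switch for the starting point. This is a sketch for small limits only; primes_below and from_square are names invented for the comparison.

```cpp
#include <cstdint>
#include <vector>

// Returns all primes below `limit`.
// from_square == true  : crossing-off starts at p * p.
// from_square == false : crossing-off starts at 2 * p, as in the question.
std::vector<std::uint64_t> primes_below(std::uint64_t limit, bool from_square) {
    std::vector<bool> is_prime(limit, true);
    std::vector<std::uint64_t> primes;
    for (std::uint64_t p = 2; p < limit; ++p) {
        if (!is_prime[p]) continue;
        primes.push_back(p);
        // Division form of "p * p >= limit", so the product cannot overflow.
        if (from_square && p > (limit - 1) / p) continue;
        const std::uint64_t start = from_square ? p * p : 2 * p;
        for (std::uint64_t i = start; i < limit; i += p) is_prime[i] = false;
    }
    return primes;
}
```

Both settings should return the same list for any limit; the from_square variant simply skips work that smaller primes have already done.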

All together you should get something like this:

#include <iostream>
#include <vector>

using namespace std;

int main() {
    unsigned long long amount = 0;

    cin >> amount;

    vector<bool>numbers (amount, true);

    for (unsigned long long p = 2; p < amount; p++) {
        if ( ! numbers[p])
            continue;

        cout << p << '\n';

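        // Same test as p * p >= amount, written as a division so that
        // p * p cannot overflow 64 bits for large p.
        if (p > (amount - 1) / p)
            continue;

        // The vector holds indices 0 .. amount - 1, so stop before amount.
        for (unsigned long long i = p * p; i < amount; i += p) {
            numbers[i] = false;
        }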
    }

    return 0;
}

Answer 2

You've asked a couple of different questions.

For primes up to 2**32, sieving is appropriate, but you need to work in segments instead of in one big block. My answer here tells how to do that.
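As a rough illustration of working in segments (a generic sketch, not the code from the linked answer): first sieve the base primes up to the square root of the limit, then sieve one fixed-size window at a time with those base primes. The names segmented_sieve, segment_size and emit are assumptions made for this example, and the limit is assumed to be well below 2^64 so that the index arithmetic cannot overflow.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Calls emit(p) for every prime p <= limit, in increasing order.
// Memory use is about sqrt(limit) + segment_size flags instead of limit flags.
template <typename Emit>
void segmented_sieve(std::uint64_t limit, std::uint64_t segment_size, Emit emit) {
    if (limit < 2) return;
    if (segment_size == 0) segment_size = 1;

    // root = floor(sqrt(limit)), computed with integers only.
    std::uint64_t root = 0;
    while (root + 1 <= limit / (root + 1)) ++root;

    // Step 1: ordinary sieve for the base primes up to root.
    std::vector<bool> small(root + 1, true);
    std::vector<std::uint64_t> base;
    for (std::uint64_t p = 2; p <= root; ++p) {
        if (!small[p]) continue;
        base.push_back(p);
        for (std::uint64_t i = p * p; i <= root; i += p) small[i] = false;
    }

    // Step 2: sieve the window [low, high], reusing one buffer.
    std::vector<bool> window(segment_size);
    for (std::uint64_t low = 2;;) {
        const std::uint64_t high =
            (limit - low < segment_size - 1) ? limit : low + segment_size - 1;
        std::fill(window.begin(), window.end(), true);
        for (std::uint64_t p : base) {
            if (p > high / p) break;  // p * p > high: nothing left to cross off
            // First multiple of p inside the window, but never below p * p.
            const std::uint64_t start = std::max(p * p, (low + p - 1) / p * p);
            for (std::uint64_t m = start; m <= high; m += p) window[m - low] = false;
        }
        for (std::uint64_t n = low; n <= high; ++n)
            if (window[n - low]) emit(n);
        if (high == limit) break;
        low = high + 1;
    }
}
```

A call such as segmented_sieve(amount, 1 << 20, [](std::uint64_t p) { std::cout << p << '\n'; }); would print the primes while keeping only about a million flags in memory at a time; the window size is a tuning choice, not a fixed rule.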

For cryptographic primes, which are very much larger, the process is to pick a number and then test it for primality, using a probabilistic test such as a Miller-Rabin test or a Baillie-Wagstaff test. This process isn't perfect, and occasionally a composite might be chosen instead of a prime, but such an occurrence is very rare.
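To give an idea of what a Miller-Rabin test looks like, here is a minimal sketch restricted to 64-bit inputs. Real cryptographic primes are far larger and need an arbitrary-precision integer library and randomly chosen bases; the fixed bases below are an assumption suited to 64-bit values only, and unsigned __int128 is a GCC/Clang extension rather than standard C++. The names mul_mod, pow_mod and is_probable_prime are invented for the example.

```cpp
#include <cstdint>

// (a * b) mod m without overflow; relies on the unsigned __int128 extension.
static std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b, std::uint64_t m) {
    return static_cast<std::uint64_t>(static_cast<unsigned __int128>(a) * b % m);
}

// (base ^ exp) mod m by repeated squaring.
static std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    std::uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = mul_mod(result, base, m);
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    return result;
}

// Miller-Rabin with the first twelve primes as bases.
bool is_probable_prime(std::uint64_t n) {
    const std::uint64_t bases[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
    if (n < 2) return false;
    for (std::uint64_t p : bases)
        if (n % p == 0) return n == p;  // small primes and their multiples

    // Write n - 1 = d * 2^r with d odd.
    std::uint64_t d = n - 1;
    int r = 0;
    while ((d & 1) == 0) { d >>= 1; ++r; }

    for (std::uint64_t a : bases) {
        std::uint64_t x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1) continue;  // base a does not expose n as composite
        bool composite = true;
        for (int i = 1; i < r; ++i) {
            x = mul_mod(x, x, n);
            if (x == n - 1) { composite = false; break; }
        }
        if (composite) return false;  // a is a witness that n is composite
    }
    return true;  // no base found a witness
}
```

For example, is_probable_prime(4294967291ULL) reports the largest prime below 2^32 as prime, while is_probable_prime(4294967297ULL) rejects 2^32 + 1 = 641 * 6700417.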

2 answers

Answer 1
6 2015-01-20 19:30:37

Answer 2
1 2015-01-20 20:01:32
