简体   繁体   English

在C中找到最接近的素数到无符号长整数(32位宽)的方法?

[英]A way to find the nearest prime number to an unsigned long integer ( 32 bits wide ) in C?

I'm looking for a way to find the closest prime number. 我正在寻找一种方法来找到最接近的素数。 Greater or less than, it doesn't matter, simply the closest ( without overflowing, preferably. ) As for speed, if it can compute it in approximately 50 milliseconds on a 1GHz machine ( in software, running inside Linux ), I'd be ecstatic. 大于或小于,无关紧要,只是最接近(没有溢出,最好是。)至于速度,如果它可以在1GHz机器上大约50毫秒计算它(在软件中,在Linux内运行),我会欣喜若狂。

The largest prime gap in the range up to (2^32 - 1) is (335). 在(2 ^ 32 - 1)范围内最大的主要差距是(335)。 There are (6542) primes less than (2^16) that can be tabulated and used to sieve successive odd values after a one-time setup. 有(6542)个小于(2 ^ 16)的质数可以制表并用于在一次性设置后筛选连续的奇数值。 Obviously, only primes <= floor(sqrt(candidate)) need be tested for a particular candidate value. 显然,只需要针对特定​​候选值测试素数<= floor(sqrt(候选者))。

Alternatively: The deterministic variant of the Miller-Rabin test , with SPRP bases: {2, 7, 61} is sufficient to prove primality for a 32-bit value. 或者: Miller-Rabin检验确定性变体 ,SPRP碱基:{2,7,61}足以证明32位值的原始性。 Due to the test's complexity (requires exponentiation, etc), I doubt it would be as fast for such small candidates. 由于测试的复杂性(需要取幂等),我怀疑它对于这样的小候选者来说会快得多。

Edit: Actually, if multiply/reduce can be kept to 32-bits in exponentiation (might need 64-bit support), the MR test might be better. 编辑:实际上,如果乘法/减少可以在求幂时保持32位(可能需要64位支持),MR测试可能会更好。 The prime gaps will typically be much smaller, making the sieve setup costs excessive. 主要间隙通常会小得多,使得筛网设置成本过高。 Without large lookup tables, etc., you might also get a boost from better cache locality. 如果没有大型查找表等,您可能还会从更好的缓存位置获得提升。

Furthermore: The product of primes {2, 3, 5, 7, 11, 13, 17, 19, 23} = (223092870). 此外:素数的乘积{2,3,5,7,11,13,17,19,23} =(223092870)。 Explicitly test any candidate in [2, 23]. 在[2,23]中明确地测试任何候选人。 Calculate greatest common divisor: g = gcd(u, 223092870UL) . 计算最大公约数: g = gcd(u, 223092870UL) If (g != 1) , the candidate is composite. 如果(g != 1) ,候选者是复合的。 If (g == 1 && u < (29 * 29)) , the candidate ( u > 23 ) is definitely prime. 如果(g == 1 && u < (29 * 29)) ,候选人( u > 23 )肯定是素数。 Otherwise, move on to the more expensive tests. 否则,继续进行更昂贵的测试。 A single gcd test using 32-bit arithmetic is very cheap, and according to Mertens' (?) theorem, this will detect ~ 68.4% of all odd composite numbers. 使用32位算术的单个gcd测试非常便宜,根据Mertens的(?)定理,这将检测所有奇数复合数的约68.4%。

UPDATE 2 : Fixed (in a heavy-handed way) some bugs that caused wrong answers for small n. 更新2 :修复(以沉重的方式)一些错误导致小n的错误答案。 Thanks to Brett Hale for noticing! 感谢Brett Hale的注意! Also added some asserts to document some assumptions. 还添加了一些断言来记录一些假设。

UPDATE : I coded this up and it seems plenty fast enough for your requirements (solved 1000 random instances from [2^29, 2^32-1] in <100ms, on a 2.2GHz machine -- not a rigorous test but convincing nonetheless). 更新 :我对此进行了编码,它看起来足够快,满足您的要求(在2.2GHz机器上,在<100ms内从[2 ^ 29,2 ^ 32-1]解决了1000个随机实例 - 不是一个严格的测试,但仍然令人信服)。

It is written in C++ since that's what my sieve code (which I adapted from) was in, but the conversion to C should be straightforward. 它是用C ++编写的,因为那是我的筛选代码(我改编自它),但是转换为C应该是直截了当的。 The memory usage is also (relatively) small which you can see by inspection. 内存使用量也相对较小,您可以通过检查看到。

You can see that because of the way the function is called, the number returned is the nearest prime that fits in 32 bits, but in fact this is the same thing since the primes around 2^32 are 4294967291 and 4294967311. 你可以看到,由于调用函数的方式,返回的数字是最接近32位的素数,但实际上这是相同的,因为2 ^ 32周围的素数是4294967291和4294967311。

I tried to make sure there wouldn't be any bugs due to integer overflow (since we're dealing with numbers right up to UINT_MAX); 我试图确保整数溢出不会有任何错误(因为我们正在处理数字直到UINT_MAX); hopefully I didn't make a mistake there. 希望我没有在那里犯错。 The code could be simplified if you wanted to use 64-bit types (or you knew your numbers would be smaller than 2^32-256) since you wouldn't have to worry about wrapping around in the loop conditions. 如果您想使用64位类型(或者您知道您的数字小于2 ^ 32-256),则可以简化代码,因为您不必担心在循环条件中回绕。 Also this idea scales for bigger numbers as long as you're willing to compute/store the small primes up to the needed limit. 此外,只要您愿意计算/存储小素数达到所需限制,这个想法就可以扩展到更大的数字。

I should note also that the small-prime-sieve runs quite quickly for these numbers (4-5 ms from a rough measurement) so if you are especially memory-starved, running it every time instead of storing the small primes is doable (you'd probably want to make the mark[] arrays more space efficient in this case) 我还要注意的是,小型筛子对于这些数字运行得非常快(粗略测量时间为4-5毫秒)所以如果你特别缺乏记忆力,那么每次运行它而不是存储小素数是可行的(你在这种情况下,我可能想让mark []数组更节省空间)

#include <iostream>
#include <cmath>
#include <climits>
#include <cassert>

using namespace std;

typedef unsigned int UI;

const UI MAX_SM_PRIME = 1 << 16;
const UI MAX_N_SM_PRIMES = 7000;
const UI WINDOW = 256;

void getSMPrimes(UI primes[]) {
  UI pos = 0;
  primes[pos++] = 2;

  bool mark[MAX_SM_PRIME / 2] = {false};
  UI V_SM_LIM = UI(sqrt(MAX_SM_PRIME / 2));
  for (UI i = 0, p = 3; i < MAX_SM_PRIME / 2; ++i, p += 2)
    if (!mark[i]) {
      primes[pos++] = p;
      if (i < V_SM_LIM)
        for (UI j = p*i + p + i; j < MAX_SM_PRIME/2; j += p)
          mark[j] = true;
      }
  }

UI primeNear(UI n, UI min, UI max, const UI primes[]) {
  bool mark[2*WINDOW + 1] = {false};

  if (min == 0) mark[0] = true;
  if (min <= 1) mark[1-min] = true;

  assert(min <= n);
  assert(n <= max);
  assert(max-min <= 2*WINDOW);

  UI maxP = UI(sqrt(max));
  for (int i = 0; primes[i] <= maxP; ++i) {
    UI p = primes[i], k = min / p;
    if (k < p) k = p;
    UI mult = p*k;
    if (min <= mult)
      mark[mult-min] = true;
    while (mult <= max-p) {
      mult += p;
      mark[mult-min] = true;
      }
    }

  for (UI s = 0; (s <= n-min) || (s <= max-n); ++s)
    if ((s <= n-min) && !mark[n-s-min])
      return n-s;
    else if ((s <= max-n) && !mark[n+s-min])
      return n+s;

  return 0;
  }

int main() {
  UI primes[MAX_N_SM_PRIMES];
  getSMPrimes(primes);

  UI n;
  while (cin >> n) {
    UI win_min = (n >= WINDOW) ? (n-WINDOW) : 0;
    UI win_max = (n <= UINT_MAX-WINDOW) ? (n+WINDOW) : UINT_MAX;

    if (!win_min)
      win_max = 2*WINDOW;
    else if (win_max == UINT_MAX)
      win_min = win_max-2*WINDOW;

    UI p = primeNear(n, win_min, win_max, primes);
    cout << "found nearby prime " << p << " from window " << win_min << ' ' << win_max << '\n';
    }
  }

You can sieve intervals in that range if you know primes up to 2^16 (there are only 6542 <= 2^16; you should go a bit higher if the prime itself could be greater than 2^32 - 1). 如果你知道质数高达2 ^ 16(只有6542 <= 2 ^ 16;如果素数本身可能大于2 ^ 32 - 1,你应该更高一些),你可以筛选该范围内的间隔。 Not necessarily the fastest way but very simple, and fancier prime testing techniques are really suited to much larger ranges. 不一定是最快捷的方式,但非常简单,而且更加绚丽的主要测试技术非常适合更大的范围。

Basically, do a regular Sieve of Eratosthenes to get the "small" primes (say the first 7000). 基本上,做一个常规的Eratosthenes筛子来获得“小”素数(比如第7000个)。 Obviously you only need to do this once at the start of the program, but it should be very fast. 显然你只需要在程序开始时执行一次,但它应该非常快。

Then, supposing your "target" number is 'a', consider the interval [an/2, a+n/2) for some value of n. 然后,假设您的“目标”数字是'a',请考虑某个n值的区间[an / 2,a + n / 2)。 Probably n = 128 is a reasonable place to start; 可能n = 128是一个合理的起点; you may need to try adjacent intervals if the numbers in the first one are all composite. 如果第一个中的数字都是复合的,您可能需要尝试相邻的间隔。

For every "small" prime p, cross out its multiples in the range, using division to find where to start. 对于每个“小”素数p,在该范围内划掉它的倍数,使用除法找到从哪里开始。 One optimization is that you only need to start crossing off multiples starting at p*p (which means that you can stop considering primes once p*p is above the interval). 一个优化是你只需要从p * p开始交叉掉多次(这意味着一旦p * p高于区间,你就可以停止考虑质数)。

Most of the primes except the first few will have either one or zero multiples inside the interval; 除前几个之外的大部分素数在区间内都有一个或零个倍数; to take advantage of this you can pre-ignore multiples of the first few primes. 利用这个你可以预先忽略前几个素数的倍数。 The simplest thing is to ignore all even numbers, but it's not uncommon to ignore multiples of 2, 3, and 5; 最简单的方法是忽略所有偶数,但忽略2,3和5的倍数并不罕见; this leaves integers congruent to 1, 7, 11, 13, 17, 19, 23, and 29 mod 30 (there are eight, which map nicely to the bits of a byte when sieving a large range). 这使整数与1,7,11,13,17,19,23和29 mod 30一致(有8个,当筛选大范围时,它很好地映射到一个字节的位)。

...Sort of went off on a tangent there; ......有点像切线那样; anyway once you've processed all the small primes (up till p*p > a+n/2) you just look in the interval for numbers you didn't cross out; 无论如何,一旦你处理了所有的小素数(直到p * p> a + n / 2),你只需查看你没有交叉的数字的间隔; since you want the closest to a start looking there and search outward in both directions. 因为你想要最接近开始,然后向两个方向向外搜索。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM