
Using both rand() and rand_r(): is this simple example correct?

I am trying to understand the correct usage of parallel random number generation. After consulting different resources, I wrote a simple program that seems to work, but it would be nice if someone could confirm my understanding.

To point out the difference and relationship between rand() and rand_r(), let's solve the following problem:

Produce a random integer N, then extract N random numbers in parallel and compute their average.

This is my proposal (error checking and free() calls omitted), with small integers on purpose:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>

int main() {
        /* Initialize and extract an integer via rand() */
        srand(time(NULL));
        int N = rand() % 100;

        /* Storage array */ 
        int *extracted = malloc(sizeof(int) * N);

        /* Initialize N seeds for rand_r, which is completely
         * independent of rand() and srand().
         * (QUESTION 1: is that right?)
         * Setting the first to time(NULL), and the others
         * by successive increments, is a good idea (? QUESTION 2) */
        unsigned int *my_seeds = malloc(sizeof(unsigned int) * N);
        my_seeds[0] = time(NULL);
        for (int i = 1; i < N; ++i) {
                my_seeds[i] = my_seeds[i - 1] + 1;
        }

        /* The seeds for rand_r are ready:
         * extract N random numbers in parallel */
        #pragma omp parallel for
        for (int i = 0; i < N; ++i) {
                extracted[i] = rand_r(my_seeds + i) % 10;
        }

        /* Compute the average: must be done sequentially (QUESTION 3),
         * because of synchronization in reading/writing avg */
        double avg = 0;
        for (int i = 0; i < N; ++i) {
                avg += extracted[i];
        }
        avg /= N;
        printf("%d samples, %.2f in average.\n", N, avg);
        return 0;
}

As the comments in the code try to highlight, it would be helpful to know whether:

  1. the simultaneous usage of rand and rand_r is correct in this case;

  2. the seed initialization for rand_r, i.e. the variable my_seeds, is fine;

  3. the parallelization of the for loop and the related variable usage is safe.

After reading various tutorials and sources online (this website included), I hope to sum up these doubts in a single, simple, ready-to-use example.

  1. There is nothing incorrect about using both, as long as rand is not called concurrently.

  2. It's unclear what you consider "fine" or "a good idea". It's fine in the sense that each seed produces a different random number sequence. It's a bit nonsensical in that you only generate a single random number from each seed, which means the generated numbers will likely all follow a very predictable pattern, just as your seeds do.

  3. There are no race conditions, so it is safe. Parallelizing fewer than 100 calls of a (presumably) simple arithmetic function is not going to be worth it from a performance perspective, but that's not what you're asking about.
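To illustrate point 2, here is a minimal sketch of one alternative: give each block of iterations a single seed and draw many numbers from it, instead of one seed per number. The function name `fill_digits` and the block size `BLOCK` are hypothetical choices made for this sketch, not part of the question's code. The seeds are still consecutive integers, so this does not by itself make the streams statistically independent.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes rand_r() under strict ISO C modes */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical block size: how many draws share one seed. */
enum { BLOCK = 16 };

/* Fill out[0..n-1] with values in 0..9, using one rand_r() stream per block. */
void fill_digits(int *out, int n, unsigned int base_seed) {
    assert(n >= 0 && (out != NULL || n == 0));
    int nblocks = (n + BLOCK - 1) / BLOCK;

    /* Without OpenMP this compiles as an ordinary sequential loop. */
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (int b = 0; b < nblocks; ++b) {
        /* Declared inside the loop body, so no two threads share a seed. */
        unsigned int seed = base_seed + (unsigned int)b;
        int begin = b * BLOCK;
        int end = (begin + BLOCK < n) ? begin + BLOCK : n;
        for (int i = begin; i < end; ++i) {
            out[i] = rand_r(&seed) % 10;
        }
    }
}
```

Because the seed depends only on the block index, the output is the same for a given `base_seed` regardless of how many threads run the loop, which can help when debugging.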

All in all, this code has no formal correctness problems. Whether it fulfills the purpose you have in mind is a different question. Take note that rand (and rand_r) tend to be only very superficially random [1], so the predictability mentioned in point 2 is just more of the same. See also "Why is rand()%6 biased?" for yet another quality-of-randomness issue in the code. In other words, be aware that the randomness you are producing here is insufficient for many applications.

[1] Assuming that unsigned int has 32 bits, the PRNG has only 32 bits of state, so it will repeat after at most 2^32 calls anyway (which is trivial to brute-force).
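The modulo bias mentioned above can be avoided with rejection sampling. The following is a minimal sketch, assuming rand_r itself is uniform over 0..RAND_MAX; the helper name `uniform_below` is hypothetical. It fixes only the bias introduced by `%`, not the underlying quality of the generator.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes rand_r() under strict ISO C modes */
#include <assert.h>
#include <stdlib.h>

/* Return a value in [0, bound) without modulo bias, drawing from rand_r(). */
unsigned int uniform_below(unsigned int *seed, unsigned int bound) {
    /* Number of distinct values rand_r() can return. */
    const unsigned int range = (unsigned int)RAND_MAX + 1u;
    assert(seed != NULL && bound > 0 && bound <= range);

    /* Largest multiple of bound that fits in range; draws at or
     * above it would over-represent the small remainders. */
    const unsigned int limit = range - range % bound;

    unsigned int r;
    do {
        r = (unsigned int)rand_r(seed);
    } while (r >= limit);  /* redraw the biased tail */

    return r % bound;
}
```

In the question's code, `rand_r(my_seeds + i) % 10` could then be written as `uniform_below(my_seeds + i, 10)`.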

  1. the simultaneous usage of rand and rand_r is in this case correct;

As long as:

  • rand is not used concurrently (which is fine in your code, since you only call it once in the main thread)
  • rand_r is not used concurrently with the same seed variable (which is fine in your code, since you only call it once for each seed variable)

there are no issues with thread safety.

  2. the seed's initialization for rand_r, i.e. the variable my_seeds, is fine;

You have a separate seed for every (potentially) concurrent use of rand_r. As long as the same seed variable isn't used for concurrent calls to rand_r (which doesn't happen in your code), all is good.

  3. the for parallelization and related variable usage is safe.

Each "thread" in your code has its own seed variable for rand_r and its own result variable, so there's no concurrency issue with respect to that.

Side note: rand_r has been obsoleted, and both rand and rand_r are relatively low-quality PRNGs. Depending on your needs, it might be worth investigating alternative PRNGs.
