
Random number generated to each process in MPI C

I am trying to generate random numbers and assign them to an array in every task. I want to make sure that the random numbers in different tasks are different. How can I achieve that?

If every MPI task initializes its own array with random numbers (as in the code below), are these numbers different among tasks?

I know I could generate a large set of random numbers and broadcast it to each task, but this may cause memory issues for large arrays.

Thank you in advance for any information.

#include <stdlib.h>
#include <mpi.h>

/* Fill inarray[0..n-1] with uniform values in [0, 1]. No seed is set here. */
void initialize(float* inarray, int n){
    int i;
    for (i=0; i<n; i++){
        inarray[i] = random() / (float)RAND_MAX;
    }
}

int main(int argc, char* argv[]){

    MPI_Comm comm=MPI_COMM_WORLD;
    int numnodes, myid, ierr;
    ierr=MPI_Init(&argc, &argv);
    ierr=MPI_Comm_size(comm, &numnodes);
    ierr=MPI_Comm_rank(comm, &myid);

    int n = 100;
    float *x = malloc(sizeof(float)*n);
    initialize(x, n);

    free(x);
    ierr=MPI_Finalize();
    return 0;
}

Here is my naive solution: in a loop, one MPI task creates the random numbers and sends them to the other tasks one at a time. To me, this is like a manual broadcast, but it saves a little memory.

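    if (myid == 0){
        int i;
        /* Generate a fresh array for each other rank and send it. */
        for (i=1; i<numnodes; i++){
            initialize(x, n);
            MPI_Send(&x[0], n, MPI_FLOAT, i, 0, comm);
        }
        /* Rank 0 then draws its own array, distinct from the ones it sent. */
        initialize(x, n);
    }else{
        MPI_Recv(&x[0], n, MPI_FLOAT, 0, 0, comm, MPI_STATUS_IGNORE);
    }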

The (pseudo) random number generator has to be seeded, with a different seed on each rank.

In my environment (CentOS 7):

#include <stdio.h>
#include <stdlib.h>

#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
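    /* Different seed on each rank; rank+1 so that no rank uses seed 0. */
    srand(rank+1);
    /* rand() is the generator that srand() seeds and RAND_MAX bounds. */
    float f = (float)rand() / (float)RAND_MAX;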
    printf ("%d: %f\n", rank, f);
    MPI_Finalize();
    return 0;
}

produces

0: 0.840188
1: 0.700976
2: 0.561380
3: 0.916458
4: 0.274746
5: 0.135439
6: 0.486904
7: 0.352761
8: 0.206965
9: 0.565811
10: 0.926345
11: 0.785600
12: 0.632643
13: 0.999498
14: 0.354973
15: 0.215437
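
Applied to the array from the question, the same idea could look like the sketch below. The helper name fill_seeded is made up for illustration, and the rank is passed in as a plain int (in an MPI program it would come from MPI_Comm_rank), so the function itself does not depend on MPI. The seed is rank + 1 as in the program above; as far as I know, some C libraries treat seed 0 like seed 1, so this keeps every rank on a distinct seed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fill x[0..n-1] with uniform values in [0, 1],
 * seeding the C library generator from the caller's MPI rank. */
void fill_seeded(float *x, int n, int rank)
{
    assert(x != NULL && n >= 0 && rank >= 0);
    srand((unsigned)rank + 1u);        /* rank + 1: no rank uses seed 0 */
    for (int i = 0; i < n; i++)
        x[i] = (float)rand() / (float)RAND_MAX;
}
```

Each rank would call fill_seeded(x, n, myid) once. Two ranks then get different arrays as long as the library maps different seeds to different sequences, which is the usual behaviour but not something the C standard promises.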

In fact, this is more complex than expected. There are two issues:

First, let's say you have 40 MPI processes that each need to draw 10 numbers from a uniform distribution between 0 and 1. If you draw all 400 numbers from a single chain, you will get a fairly good uniform distribution. But if you draw only 10 numbers from a chain, you probably will not. Drawing 10 numbers from each of 40 chains is not strictly equivalent to drawing 400 numbers from one chain. This issue only matters if you need good control over your distribution.

Second, let's say you want to study a given problem numerically. One thing that you have to test is the convergence of your solver. To do so, you need to see whether your solver behaves well under refinement of the resolution. So you need to generate different realizations (your random numbers) in a way that keeps the same low frequencies (if you think in Fourier space). To do so, you need to be sure that the random number at a given position (let's say number 327 of your 400) is always the same, regardless of the number of processes you use.

Both of these issues show that simply drawing consecutive numbers from an independently seeded chain on each process is not a good solution.

So what you can do is:
- use a single chain
- the 1st process uses the first 10 numbers of the chain
- the 2nd process uses the 11th to 20th numbers of the chain
- ...
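
A straightforward way to write this with the standard C generator is sketched below. The helper name fill_from_shared_chain is hypothetical, and the rank is passed in as a plain int (it would come from MPI_Comm_rank). Every rank uses the same seed and simply throws away the draws that belong to lower ranks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fill x[0..per_rank-1] with the slice of one shared
 * rand() chain that belongs to `rank`. All ranks must pass the same seed. */
void fill_from_shared_chain(float *x, int per_rank, int rank, unsigned seed)
{
    assert(x != NULL && per_rank >= 0 && rank >= 0);
    srand(seed);                                   /* same chain on every rank */
    for (long i = 0; i < (long)rank * per_rank; i++)
        (void)rand();                              /* skip draws owned by lower ranks */
    for (int i = 0; i < per_rank; i++)
        x[i] = (float)rand() / (float)RAND_MAX;
}
```

The skip loop costs rank * per_rank calls to rand(), so the last rank does almost as much work as a single process generating the whole chain.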

But this means that a given process has to discard a given number of elements of the chain. If you use a "usual" random number generator, discarding n numbers is an O(n) operation, which means that you can't parallelize your problem. What you need is a random number generator that provides a method to discard n draws in O(1). There are plenty of such generators to be found on the web. Or you can code one yourself (but that is really reinventing the wheel).
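
To give an idea of what an O(1) discard can look like in plain C, here is a minimal sketch built on the SplitMix64 mixing function. This is my own illustration, not the generator from the library mentioned below, and the names mix64, chain_at and fill_slice are made up. The state of SplitMix64 is just a counter, so the element at any position of the chain can be computed directly from the seed and the position.

```c
#include <assert.h>
#include <stdint.h>

/* SplitMix64 output function: scrambles a 64-bit counter value. */
static uint64_t mix64(uint64_t z)
{
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Element number `index` (zero-based) of the chain, as a float in [0, 1).
 * Cost is O(1) whatever the index, since no earlier element is computed. */
float chain_at(uint64_t seed, uint64_t index)
{
    uint64_t state = seed + (index + 1u) * 0x9E3779B97F4A7C15ULL;
    /* Keep the top 24 bits so the value is exactly representable as float. */
    return (float)(mix64(state) >> 40) * (1.0f / 16777216.0f);
}

/* Hypothetical helper: each rank fills its own slice of the shared chain. */
void fill_slice(float *x, int per_rank, int rank, uint64_t seed)
{
    assert(x != 0 && per_rank >= 0 && rank >= 0);
    uint64_t first = (uint64_t)rank * (uint64_t)per_rank;
    for (int i = 0; i < per_rank; i++)
        x[i] = chain_at(seed, first + (uint64_t)i);
}
```

With this layout the element at zero-based position 327 has the same value whether it lands on rank 32 of 40 processes (10 numbers each) or on rank 3 of 4 processes (100 numbers each). Whether the statistical quality of such a simple generator is good enough depends on the application.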

Myself, I use: https://rdrr.io/cran/sitmo/
It is for C++, but you may be able to find a version for C, or compile just this part with C++.
