
Parallelization of radix-sort in C with OpenMP

How would you go about parallelizing the radix-sort algorithm in C with OpenMP?

My program is a modification of a typical radix sort: it sorts an array of integers by the binary representation of each value, and you can vary the number of bits that are interpreted as one digit (which changes the running time depending on how large your integers are).

I have a radix function that takes three arguments:

// n is the number of elements in data
// b is number of bits that should be interpreted as one digit
void radix(int* data, int n, int b);

My radix function then iterates over all the bits of an int (32 bits) in steps of b:

for(bit = 0; bit < 32; bit += b) { ... }

The loop body has three parts:

  • Counting occurrences of a certain digit (actually a group of bits), to decide how much storage each bucket needs: bucket[(data[i] >> bit) & (int)(pow(2,b)-1)]++
  • Putting values in a temporary array (the buckets). For this indexing to work, bucket must hold each digit's starting offset (a running sum of the counts) rather than the raw counts.

    bitval = (data[i] >> bit) & (int)(pow(2,b)-1);

    temp_data[bucket[bitval]++] = data[i];

  • Copying values from the temporary buckets back to the data pointer given to the function.

    for(i = 0; i < n; i++) { data[i] = temp_data[i]; }
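The three parts above can be put together as a minimal serial sketch. It is not the original program: the name radix_serial, the int return value, and the limit of 16 bits per digit are assumptions made for illustration, keys are treated as unsigned (so negative values would sort after positive ones), and a prefix-sum step is added between counting and scattering so that bucket holds starting offsets.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of a serial LSD radix sort on b-bit digits.
 * Assumptions: keys are ordered as unsigned values; radix_serial is a
 * hypothetical name. Returns 0 on success, -1 if allocation fails. */
int radix_serial(int *data, int n, int b)
{
    assert(b > 0 && b <= 16);              /* keeps the bucket table small */
    if (n <= 1)
        return 0;

    const int total_bits = 8 * (int)sizeof(unsigned int);
    const size_t nbuckets = (size_t)1 << b;
    size_t *bucket = malloc(nbuckets * sizeof *bucket);
    int *temp_data = malloc((size_t)n * sizeof *temp_data);
    if (!bucket || !temp_data) {
        free(bucket);
        free(temp_data);
        return -1;
    }

    for (int bit = 0; bit < total_bits; bit += b) {
        /* The last digit may be narrower than b bits. */
        int width = (total_bits - bit < b) ? total_bits - bit : b;
        unsigned int mask = (1u << width) - 1u;

        /* Part 1: count how often each digit value occurs. */
        memset(bucket, 0, nbuckets * sizeof *bucket);
        for (int i = 0; i < n; i++)
            bucket[((unsigned int)data[i] >> bit) & mask]++;

        /* Turn counts into starting offsets (exclusive prefix sum). */
        size_t sum = 0;
        for (size_t d = 0; d < nbuckets; d++) {
            size_t count = bucket[d];
            bucket[d] = sum;
            sum += count;
        }

        /* Part 2: scatter values into the temporary array, in order. */
        for (int i = 0; i < n; i++) {
            unsigned int bitval = ((unsigned int)data[i] >> bit) & mask;
            temp_data[bucket[bitval]++] = data[i];
        }

        /* Part 3: copy back so the next pass reads from data. */
        memcpy(data, temp_data, (size_t)n * sizeof *data);
    }

    free(bucket);
    free(temp_data);
    return 0;
}
```

A call such as radix_serial(values, count, 8) would sort in four passes on a platform with a 32-bit unsigned int.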

Parallelization is going to be an issue because the limiting factor will be memory bandwidth (there is very little CPU overhead, and only one memory bus).
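With that caveat in mind, one common way to structure an OpenMP version is to give each thread a contiguous slice of the array and its own histogram, then combine the histograms into per-thread starting offsets before scattering. The following is only a sketch under assumptions: radix_omp, the slice arithmetic, and the 16-bit digit limit are hypothetical choices, keys are treated as unsigned, and the code falls back to a single thread when compiled without OpenMP support.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Sketch: LSD radix sort with one histogram per thread.
 * radix_omp is a hypothetical name. Returns 0 on success, -1 on failure. */
int radix_omp(int *data, int n, int b)
{
    assert(b > 0 && b <= 16);
    if (n <= 1)
        return 0;

    const int total_bits = 8 * (int)sizeof(unsigned int);
    const size_t nbuckets = (size_t)1 << b;
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();
#endif
    size_t *hist = malloc((size_t)nthreads * nbuckets * sizeof *hist);
    int *temp_data = malloc((size_t)n * sizeof *temp_data);
    if (!hist || !temp_data) {
        free(hist);
        free(temp_data);
        return -1;
    }

    for (int bit = 0; bit < total_bits; bit += b) {
        int width = (total_bits - bit < b) ? total_bits - bit : b;
        unsigned int mask = (1u << width) - 1u;

        #pragma omp parallel num_threads(nthreads)
        {
            int t = 0, nt = 1;
#ifdef _OPENMP
            t = omp_get_thread_num();
            nt = omp_get_num_threads();
#endif
            /* Contiguous slice [lo, hi) owned by this thread. */
            int lo = (int)((long long)n * t / nt);
            int hi = (int)((long long)n * (t + 1) / nt);
            size_t *mine = hist + (size_t)t * nbuckets;

            /* Count digits in this thread's slice only. */
            memset(mine, 0, nbuckets * sizeof *mine);
            for (int i = lo; i < hi; i++)
                mine[((unsigned int)data[i] >> bit) & mask]++;

            #pragma omp barrier
            #pragma omp single
            {
                /* Digit-major, thread-minor prefix sum: keeps the sort
                 * stable because earlier slices get earlier offsets. */
                size_t sum = 0;
                for (size_t d = 0; d < nbuckets; d++) {
                    for (int k = 0; k < nt; k++) {
                        size_t count = hist[(size_t)k * nbuckets + d];
                        hist[(size_t)k * nbuckets + d] = sum;
                        sum += count;
                    }
                }
            } /* implicit barrier at the end of single */

            /* Scatter this thread's slice using its private offsets. */
            for (int i = lo; i < hi; i++) {
                unsigned int bitval = ((unsigned int)data[i] >> bit) & mask;
                temp_data[mine[bitval]++] = data[i];
            }
        }

        memcpy(data, temp_data, (size_t)n * sizeof *data);
    }

    free(hist);
    free(temp_data);
    return 0;
}
```

Each pass still streams the whole array through memory several times, so any speedup from this layout depends on how much memory bandwidth the machine has to spare.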

Also, instead of using the floating-point function pow(2,b), create a bit mask and right-shift count based on b:

    numberOfBits = b;
    shiftCount = 0;
    while(1){  // main loop
        // set numberOfBuckets and bitMask for the current field
        numberOfBuckets = 1 << numberOfBits;
        bitMask = numberOfBuckets - 1;
        // code to generate histogram for this field goes here
        // ...
        shiftCount += numberOfBits;
        // check for done first, so a zero-width field is never processed
        if(shiftCount == (8*sizeof(unsigned int)))
            break; // done
        // check for partial bit field
        if((shiftCount + numberOfBits) > (8*sizeof(unsigned int))){
            numberOfBits = (8*sizeof(unsigned int)) - shiftCount;
            continue; // do partial bit field
        }
    }
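To make the field arithmetic in the loop above concrete, here is a small sketch that only computes the shift and width of each pass without sorting anything. The helper name field_layout and its output arrays are hypothetical, introduced for illustration.

```c
#include <assert.h>

/* Hypothetical helper: records the shift count and bit width of every
 * pass for a digit size of b bits, and returns the number of passes.
 * shift[] and width[] must each have room for 8*sizeof(unsigned int)
 * entries (the worst case, b == 1). */
int field_layout(int b, int shift[], int width[])
{
    const int total_bits = 8 * (int)sizeof(unsigned int);
    assert(b > 0 && b < total_bits);

    int passes = 0;
    for (int shiftCount = 0; shiftCount < total_bits; shiftCount += b) {
        int remaining = total_bits - shiftCount;
        shift[passes] = shiftCount;
        /* The final field is partial when b does not divide total_bits. */
        width[passes] = (remaining < b) ? remaining : b;
        passes++;
    }
    return passes;
}

/* Mask for a field of the given width, built with integer shifts only. */
unsigned int field_mask(int width)
{
    assert(width > 0 && width < 8 * (int)sizeof(unsigned int));
    return (1u << width) - 1u;
}
```

Assuming a 32-bit unsigned int, b = 5 gives seven passes: six 5-bit fields at shifts 0 through 25, then one 2-bit field at shift 30.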

If sorting signed integers, you'll need to adjust for the most significant field (also, the result of right-shifting a negative signed integer is implementation-defined in C, so it depends on the compiler and platform). One solution (for two's complement signed integers) is to cast to unsigned integer and complement the sign bit for bucket index generation.
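A minimal sketch of that sign-bit adjustment, assuming two's complement int; the names signed_key and bucket_index are hypothetical. Flipping the sign bit maps the most negative value to key 0 and the most positive value to the largest key, so unsigned bucket order matches signed order.

```c
#include <assert.h>
#include <limits.h>

/* Assumes two's complement int. Converts to unsigned first, so every
 * later shift operates on an unsigned value, then flips the sign bit. */
unsigned int signed_key(int value)
{
    const unsigned int sign_bit = 1u << (sizeof(unsigned int) * CHAR_BIT - 1);
    return (unsigned int)value ^ sign_bit;
}

/* Bucket index for one field of the key. */
unsigned int bucket_index(int value, int shift, unsigned int mask)
{
    assert(shift >= 0 && shift < (int)(sizeof(unsigned int) * CHAR_BIT));
    return (signed_key(value) >> shift) & mask;
}
```

Only the bucket index changes; the values stored in the array stay as they are.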
