
Radix sort array of strings

I am trying to use radix sort to sort a file containing Social Security numbers and dates of birth. The format looks like this: "###-##-####,#######". I have to apply radix sort to each field according to a command-line switch. I have a radix sort that works for an int array, and I am trying to modify the code for an array of strings, but I am not sure how to accomplish this. I wrote a quicksort for strings by comparing each string with the pivot, and that works fine. For radix sort, however, I am not sure whether I can do this with the string type or whether I have to convert the strings to integers. I have tried to use "atoi" to convert to integer, but I am not sure how to do this correctly if I have to.

   string getMax(string arr[], int n){
    string max = arr[0];
    for (int i = 1; i < n; i++){
        if (arr[i]>max)
            max = arr[i];
    }
    return max;
   }

   void countSort(string a[], int size, int k){
    string *b = NULL; int *c = NULL;
    b = new string[size];
    c = new int[k]; 



    for (int i = 0; i <k; i++){
        c[i] = 0;
        //cout << c[i] << "\n";
    }
    for (int j = 0; j <size; j++){   
        c[(a[j]/k)%10]++;            //a[j] is a string
        //cout << c[a[j]] << endl;
    }

    for (int f = 1; f <10; f++){
        c[f] += c[f - 1];
    }

    for (int r = size - 1; r >= 0; r--){
        b[c[(a[r] / k) % 10] - 1] = a[r];
        c[(a[r] / k) % 10]--;
    }

    for (int l = 0; l < size; l++){
        a[l] = b[l];
    }

    }


    void radixSort(string b[], int r){
    string max = getMax(b, r);
    for (int digit = 1; max / digit > 0; digit *= 10){
        countSort(b, r, digit);
    }

    };  

I haven't tried it, but I think you can do radix sort on strings.

  1. Calculate the length of the longest string in the array to sort.
  2. Do radix sort just like for integers, but sort on each character position in the string. If a string is shorter than another and has no character at the current "digit" position, treat its value as -65536 (or any value smaller than every real character).
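The padding rule in step 2 can be sketched as a small helper. The name `bucketAt` is hypothetical, and this sketch assumes one counting bucket per byte value: instead of a literal -65536 it reserves bucket 0 for "no character here" and shifts real characters to buckets 1..256, which has the same effect of ranking a missing character below every real one.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bucket index of string s at character position pos.
// Bucket 0 means "s has no character at pos", so a shorter string sorts
// before any longer string that shares its prefix.
// Real bytes are cast to unsigned char (0..255) and shifted to 1..256,
// so the counting array needs 257 slots.
int bucketAt(const std::string& s, std::size_t pos) {
    if (pos >= s.size()) {
        return 0;
    }
    return static_cast<int>(static_cast<unsigned char>(s[pos])) + 1;
}
```

The cast through `unsigned char` matters because plain `char` may be signed, in which case bytes above 127 would otherwise produce negative indices.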

UPDATE: I tested my idea and it seems to work.

#include <cstdio>
#include <string>
using std::string;

size_t getMax(string arr[], int n){
    size_t max = arr[0].size();
    for (int i = 1; i < n; i++){
        if (arr[i].size()>max)
            max = arr[i].size();
    }
    return max;
}

void countSort(string a[], int size, size_t k){
    string *b = NULL; int *c = NULL;
    b = new string[size];
    c = new int[257];



    for (int i = 0; i <257; i++){
        c[i] = 0;
        //cout << c[i] << "\n";
    }
    for (int j = 0; j <size; j++){   
        c[k < a[j].size() ? (int)(unsigned char)a[j][k] + 1 : 0]++;            //a[j] is a string
        //cout << c[a[j]] << endl;
    }

    for (int f = 1; f <257; f++){
        c[f] += c[f - 1];
    }

    for (int r = size - 1; r >= 0; r--){
        b[c[k < a[r].size() ? (int)(unsigned char)a[r][k] + 1 : 0] - 1] = a[r];
        c[k < a[r].size() ? (int)(unsigned char)a[r][k] + 1 : 0]--;
    }

    for (int l = 0; l < size; l++){
        a[l] = b[l];
    }

    // avoid memory leak
    delete[] b;
    delete[] c;
}


void radixSort(string b[], int r){
    size_t max = getMax(b, r);
    for (size_t digit = max; digit > 0; digit--){ // size_t is unsigned, so avoid using digit >= 0, which is always true
        countSort(b, r, digit - 1);
    }

}

int main(void) {
    string data[] = {
        "aaaba",
        "dfjasdlifjai",
        "jiifjeogiejogp",
        "aabaaaa",
        "gsgj",
        "gerph",
        "aaaaaaa",
        "htjltjlrth",
        "joasdjfisdjfdo",
        "hthe",
        "aaaaaba",
        "jrykpjl",
        "hkoptjltp",
        "aaaaaa",
        "lprrjt"
    };
    puts("before sorting:");
    for (size_t i = 0; i < sizeof(data) / sizeof(data[0]); i++) {
        printf("    %s\n", data[i].c_str());
    }
    radixSort(data, (int)(sizeof(data) / sizeof(data[0])));
    puts("after sorting:");
    for (size_t i = 0; i < sizeof(data) / sizeof(data[0]); i++) {
        printf("    %s\n", data[i].c_str());
    }
    return 0;
}
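To connect this back to the question's "###-##-####,#######" records, here is a rough sketch of the same character-wise radix sort applied to one field at a time. It is only an illustration under stated assumptions: `fieldOf`, `countSortByField` and `radixSortByField` are hypothetical names, field 0 is taken to be the text before the first comma and field 1 the text after it, and the digits in each field are assumed to be written most significant first, so that character order matches the intended order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: text before (field 0) or after (field 1) the first comma.
// Assumes the "SSN,DOB" record layout described in the question.
std::string fieldOf(const std::string& record, int field) {
    std::size_t comma = record.find(',');
    if (comma == std::string::npos) {
        return field == 0 ? record : std::string();
    }
    return field == 0 ? record.substr(0, comma) : record.substr(comma + 1);
}

// One stable counting-sort pass on character position pos of the chosen field.
// Bucket 0 stands for "field is shorter than pos"; real bytes use 1..256.
void countSortByField(std::vector<std::string>& recs, int field, std::size_t pos) {
    std::vector<std::size_t> count(257, 0);
    std::vector<int> keys(recs.size());
    for (std::size_t i = 0; i < recs.size(); ++i) {
        std::string f = fieldOf(recs[i], field);
        keys[i] = pos < f.size() ? static_cast<unsigned char>(f[pos]) + 1 : 0;
        ++count[keys[i]];
    }
    // Prefix sums: count[b] becomes the end offset of bucket b in the output.
    for (std::size_t b = 1; b < count.size(); ++b) {
        count[b] += count[b - 1];
    }
    // Walk backwards so records with equal keys keep their relative order.
    std::vector<std::string> out(recs.size());
    for (std::size_t i = recs.size(); i-- > 0; ) {
        out[--count[keys[i]]] = recs[i];
    }
    recs.swap(out);
}

// LSD radix sort of whole records, ordered by the chosen field only.
void radixSortByField(std::vector<std::string>& recs, int field) {
    std::size_t maxLen = 0;
    for (const std::string& r : recs) {
        std::size_t len = fieldOf(r, field).size();
        if (len > maxLen) maxLen = len;
    }
    for (std::size_t pos = maxLen; pos > 0; --pos) {
        countSortByField(recs, field, pos - 1);
    }
}
```

The command-line switch would then only need to choose between `radixSortByField(records, 0)` and `radixSortByField(records, 1)`. No `atoi` conversion is needed in this approach: the hyphens sit at the same positions in every record, so they fall into the same bucket and do not affect the order.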


 