
What is wrong with this sorting algorithm in C?

I was attempting a problem where I had to arrange a list of 5000 first names alphabetically (the names were stored in a text file "names.txt"). As seen from my code below, I created a 2D array names[n][m] to store the names.

For every name, I compare it alphabetically against all the other names. Whenever the i-th name is alphabetically larger than another, its alphabetical ranking, stored in the array element rank[i], is incremented. For example, when "Mary" is compared to "Denise", Mary's rank is incremented by 1 since it is alphabetically larger than Denise. All the ranks start from 1.

This seemed to work, as it gave the right result when tested with the example provided in the question. However, the final answer I obtained was incorrect. More importantly, I discovered that several of the names shared the same ranking (i.e. I checked and found that "Columbus" and "Colt" both have the same ranking). I'm not sure why or where my algorithm is flawed, as it seems logically sound(?) to me. I tried making my code more readable by adding a few comments, and I would appreciate it if someone could explain my mistake to me. I've only been coding for a few days, so forgive me if I have made any rookie mistakes. Thanks for your time!

Link to problem: https://projecteuler.net/problem=22

EDIT: The code is slightly truncated (I omitted the final step where I just added all the scores together). But the error I talked about is only relevant to the provided code. Thanks!

#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>

int main() {
    FILE *fp;
    int i, j;
    int a = 0;
    int b = 0;
    fp = fopen("names.txt", "r");
    char names[5200][30] = { 0 };
    int rank[5200] = { 0 }; //Rank corresponds to their alphabetical positions
    unsigned long long score[5200] = { 0 };
    unsigned long long sum = 0;
    for (i = 0; i < 5200; i++) {
        (rank[i])++;  //All the rankings start from 1.
    }
    for (i = 0; !feof(fp); i++) {
        fscanf(fp, "\"%[A-Z]\",", &names[i]); //Scanning and storing the names from the file into the array.
    }

    for (i = 0; i < 5200; i++) {
        for (j = 0; j < 5200; j++) {
            if (i != j && names[i][0] != 0 && names[j][0] != 0) {
                while (names[i][a] == names[j][a]) {  //If the ith word and jth word have the same letter, then increment a (which advances to the next letter).
                    a++;
                }
                if (names[i][a] > names[j][a]) { 
                    (rank[i])++; //If the ith word has a larger letter than the jth word, there will be an increase in its rank.
                } else
                if (names[j][a] == 0 && names[i][a] != 0) { 
                    (rank[i])++; //If the jth word is shorter than the ith word, then i also increases its rank.
                }
            }
            a = 0;
        }
        for (a = 0; a < 30; a++) {
            if (names[i][a] != 0 && names[i][0] != 0) {
                score[i] += (names[i][a] - 64); //Sum up the alphabetical value (as per the question) for each name.
            }
        }
        score[i] = (rank[i]) * (score[i]);
    }

Your algorithm works but there are a few implementation problems:

  • the loop test in for (i = 0; !feof(fp); i++) { is incorrect. fscanf() may fail to convert the file contents before the end of file, causing an infinite loop. You should instead test whether fscanf() returns 1 for success.
  • you should count the number of words read into the array and restrict the loops to this range.
  • you should not assume that names are not duplicated in the file. The loop while (names[i][a] == names[j][a]) { a++; } has undefined behavior if names[i] and names[j] have the same contents, because it keeps comparing past the null terminators. Using strcmp() to compare the names is simpler and safer.
  • there is no need to keep the ranks and scores of all names; you can compute the sum on the fly inside the outer loop. This saves the initialization code too.
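
To illustrate the third point, here is a minimal sketch of a character-by-character comparison that also stops at the null terminator, so two identical names cannot make the index run past the end of the strings. The helper name sorts_after is hypothetical and not part of the original code; it is only meant to mirror what strcmp(a, b) > 0 reports.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the original post):
 * returns 1 if name a sorts strictly after name b, 0 otherwise. */
int sorts_after(const char *a, const char *b)
{
    size_t k = 0;

    assert(a != NULL && b != NULL);

    /* Advance while the characters match, but stop at the end of a.
     * If b ends first, the characters differ and the loop stops too. */
    while (a[k] != '\0' && a[k] == b[k]) {
        k++;
    }

    /* Compare the first differing characters as unsigned char,
     * the same way strcmp() orders them. Equal strings give 0. */
    return (unsigned char)a[k] > (unsigned char)b[k];
}
```

With this helper, sorts_after("MARY", "DENISE") and sorts_after("COLUMBUS", "COLT") are both 1, while comparing a name with an identical copy of itself gives 0 instead of reading out of bounds.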

Here is a corrected and simplified version:

#include <stdio.h>
#include <string.h>

int main() {
    char names[5200][30];
    FILE *fp;
    int i, j, n, a, rank, score;
    unsigned long long sum = 0;

    fp = fopen("p022_names.txt", "r");
    if (fp == NULL)
        return 1;

    // Scan and store the names from the file into the array.
    for (n = 0; n < 5200; n++) {
        if (fscanf(fp, "\"%29[A-Z]\",", names[n]) != 1)
            break;
    }
    fclose(fp);

    // n first names read from file.
    for (i = 0; i < n; i++) {
        rank = 1;
        for (j = 0; j < n; j++) {
            if (strcmp(names[i], names[j]) > 0)
                rank++;
        }
        score = 0;
        for (a = 0; names[i][a] != '\0'; a++) {
            // Sum up the alphabetical value (as per the question) for each name.
            score += names[i][a] - 'A' + 1;
        }
        sum += rank * score;
    }
    printf("Total score of %d names: %llu\n", n, sum);
    return 0;
}

Output:

Total score of 5163 names: 871198282
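
As a quick sanity check of the scoring step, the Project Euler problem statement gives a worked example: COLIN is worth 3 + 15 + 12 + 9 + 14 = 53 and, being the 938th name in the sorted list, scores 938 × 53 = 49714. The sketch below isolates that computation; the function names name_value and name_score are hypothetical and assume names made only of the upper-case letters A to Z in an ASCII-compatible character set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: alphabetical value of an upper-case name,
 * with A = 1, B = 2, ..., Z = 26. */
int name_value(const char *name)
{
    int value = 0;

    assert(name != NULL);
    for (; *name != '\0'; name++) {
        value += *name - 'A' + 1;
    }
    return value;
}

/* Hypothetical helper: score of a name given its 1-based rank
 * in the sorted list (rank multiplied by alphabetical value). */
unsigned long long name_score(int rank, const char *name)
{
    return (unsigned long long)rank * (unsigned long long)name_value(name);
}
```

Calling name_value("COLIN") gives 53 and name_score(938, "COLIN") gives 49714, matching the example in the problem statement.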
