Program to find Word Length Statistics

I am expected to make a program to calculate and display statistics about the length of words in a text file. I have been provided the following function:

int readFile(const char fName[], char textStr[]){
    FILE *fPtr;
    char ch;
    int size = 0;

    if ((fPtr = fopen(fName, "r")) == NULL) {
        fprintf(stderr, "Error, failed to open %s: ", fName);
        perror("");
        return 1;
    }

    while ((ch = fgetc(fPtr)) != EOF) {
        if (size >= MAX_FILE - 1)
            break;
        textStr[size++] = ch;
    }

    textStr[size] = '\0';

    return size;
}

I was able to verify that I can access the file using the following code:

int main() {
    char str[MAX_FILE];
    int len = readFile("test.txt", str);
    if (len == -1) {
        printf("An error occurred\n");
    } else {
        printf("file read");
    }
}

File test.txt contains:

The quick brown fox jumps over the lazy dog

What I want to do is get the contents of test.txt and find the length of each word in it, something like:

1 letter words - 0
2 letter words - 0
3 letter words - 3
4 letter words - 4

and so on...

As a fellow new contributor, I'm going to give you a break and try to answer the question you didn't ask. ;)

I believe the question is "how to proceed". This is going to be a long answer, as I will try to be very detailed since you seem to be a newbie. Hopefully this will help you or maybe someone else.

The trick is to take a word problem and convert it into a mathematical solution. The best way to do this is to write "pseudocode". (See Wikipedia for more information, if you need to.) I'm going to give you some pseudocode at the end, but since this appears to be a homework assignment, please try to write your own pseudocode first. If you read the pseudocode and it still doesn't help, I can post my solution later. (I'm not a great programmer, so it might not be the best program. And it took far too long to come up with it.)

First things first: there appears to be a typo in the code you posted. In the source code you were provided, the problem is the return 1 statement that runs if the file can't be opened. That should be return -1, because what would happen if you had a test file that contained exactly 1 character? readFile would also return 1 as the size, so the caller could not tell a successful read from a failure. Your main already checks for -1, so as posted the error branch can never run.
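A minimal sketch of what the corrected function might look like follows. It is not the code you were provided: besides changing the error return to -1, it declares ch as int (fgetc returns an int so that EOF can be distinguished from real characters) and closes the file before returning. The value of MAX_FILE is an assumption, since the question does not show its definition.

```c
#include <stdio.h>

#define MAX_FILE 1000  /* assumed buffer size; the question does not show its value */

/* Reads up to MAX_FILE - 1 characters of fName into textStr.
 * Returns the number of characters read, or -1 if the file cannot be opened. */
int readFile(const char fName[], char textStr[]) {
    FILE *fPtr;
    int ch;          /* int, not char, so EOF can be told apart from real data */
    int size = 0;

    if ((fPtr = fopen(fName, "r")) == NULL) {
        fprintf(stderr, "Error, failed to open %s: ", fName);
        perror("");
        return -1;   /* cannot collide with a valid length */
    }

    /* Stop at end of file or when the buffer is full, whichever comes first. */
    while (size < MAX_FILE - 1 && (ch = fgetc(fPtr)) != EOF) {
        textStr[size++] = (char)ch;
    }
    textStr[size] = '\0';

    fclose(fPtr);
    return size;
}
```

With this version, the len == -1 check in your main matches what the function reports on failure.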

Now, to first convert the word problem you were given: you need an array of word counts to keep track of 1-letter, 2-letter, etc. words. Reportedly, the longest word in an English dictionary has 45 letters. So, in theory, you would need an array of 45 wordCounts elements. You can shorten this as required.

Now, to process your str variable, you need a while statement that goes through it one character at a time. Since the characters in the string run from element 0 through one less than the len variable, you need to code the while accordingly.

Within that while, you need another while. This inner while needs to count up the wordLength one character at a time, until you see a blank or the trailing '\0' character of str. To do this, you initialize the wordLength to zero right before the second while. Then add 1 to the wordLength for each character you count, and increment your subscript.

At the end of this inner while, you need to accumulate your wordCounts. Keep in mind that your 1-letter words are going to be accumulated into element 0 of your array, so the element you add 1 to is wordCounts[wordLength - 1]. After that, you need to increment the subscript you are using to go through your str, to step past the blank that ended the word.
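The two nested loops described above could be sketched roughly as follows. The function name countWordLengths and the constant MAX_WORD_LEN are made up for illustration. Two details go beyond the description and are assumptions on my part: the sketch treats any whitespace (including the newline that usually ends a text file) as a word separator by using isspace, and it skips zero-length "words" that would otherwise arise from two separators in a row. Punctuation is still counted as part of a word.

```c
#include <ctype.h>

#define MAX_WORD_LEN 45  /* assumed upper bound, taken from the 45-letter figure above */

/* Tallies the word lengths found in str[0..len-1] into wordCounts,
 * where wordCounts[n - 1] holds the number of n-letter words.
 * wordCounts must have room for MAX_WORD_LEN elements. */
void countWordLengths(const char str[], int len, int wordCounts[]) {
    int subscript = 0;

    /* Zero the tallies first, otherwise they start with garbage values. */
    for (int i = 0; i < MAX_WORD_LEN; i++)
        wordCounts[i] = 0;

    while (subscript < len) {
        int wordLength = 0;

        /* Inner loop: measure one word, stopping at whitespace or '\0'. */
        while (subscript < len && str[subscript] != '\0'
               && !isspace((unsigned char)str[subscript])) {
            wordLength++;
            subscript++;
        }

        /* wordLength is 0 when two separators are adjacent; skip that case.
         * Words longer than MAX_WORD_LEN are counted in the last element. */
        if (wordLength > 0) {
            if (wordLength > MAX_WORD_LEN)
                wordLength = MAX_WORD_LEN;
            wordCounts[wordLength - 1]++;
        }

        subscript++;  /* step past the separator that ended the word */
    }
}
```

For the sentence in test.txt, this tallies four 3-letter words, two 4-letter words, and three 5-letter words.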

At the end, you need to print out the wordCounts array values. Since most of the word lengths will have a count of zero, I wouldn't print those, unless you set the maximum length of the wordCounts array to something like 10 instead of 45. You want a for loop to go through your wordCounts array and do something like this: printf("%2d letter words = %d", ..., ...);. Keep in mind your 1-letter words are going to be in element 0.
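One way the printing loop might look is sketched below. The function name printWordCounts, its parameters, and the idea of returning the number of lines printed are all illustrative choices, not something from the assignment. Lengths with a count of zero are skipped, and i + 1 converts the array index back to a word length.

```c
#include <stdio.h>

/* Prints one line per non-zero word length to out and returns
 * the number of lines written. wordCounts[i] is the tally for
 * words of i + 1 letters; maxLen is the number of array elements. */
int printWordCounts(FILE *out, const int wordCounts[], int maxLen) {
    int lines = 0;

    for (int i = 0; i < maxLen; i++) {
        if (wordCounts[i] == 0)
            continue;  /* skip lengths that never occurred */
        fprintf(out, "%2d letter words = %d\n", i + 1, wordCounts[i]);
        lines++;
    }
    return lines;
}
```

Calling printWordCounts(stdout, wordCounts, 45) after the counting step would print only the lengths that actually occur in the file.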

That is a very detailed version of a word problem that is the solution to the problem of "count the number of words that the phrase has, from 1-letter words to x-letter words".

Here is the pseudocode I came up with, after coding my solution. It is a little more detailed than normal pseudocode would be. (Personally, I abbreviate all variable names and use Pascal case, but that's just me.)

Declare a numeric array of wordCounts and a subscript.

For each element of wordCounts, zero out the number of words, or the code won't work right.

Reinitialize subscript to zero.

As long as (while) the subscript is less than len, continue.

Initialize wordLength to zero.

As long as str[subscript] is not a blank or a null character, add 1 to wordLength and increment the subscript.

If wordLength is greater than zero, add 1 to the wordCounts[wordLength - 1] element.

Increment the subscript.

After both while statements are complete, print out the wordCounts array, as described above.

You're done!

Now, I could post the actual code that was used to come up with this pseudocode, but it would be better if you came up with it yourself. If you try but have a bug in your code, post a new question, and I'll try to check back to answer it. Hope this helps you or someone else.
