
How to find number of occurrences in array of chars in C?

I am trying to enter a word, and get how many times the letters were typed.

Say my input is "hello"

my output would be: h = 1, e = 1 l = 2 etc.

I am very close to getting it right, but I have a small issue with this code:

#include <stdio.h>
#include <string.h>

void find_frequency(char s[], int count[]) {
    int c = 0;

    while (s[c] != '\0') {
        if (s[c] >= 'a' && s[c] <= 'z' )
            count[s[c]-'a']++;
        c++;
    }
}

int main()
{
    char string[100];
    int c, count[26] = {0};

    printf("Input a string\n");
    gets(string);

    find_frequency(string, count);

    printf("Character Count\n");

    for (c = 0 ; c < 26 ; c++)
        if(count[c] > 0)
            printf("%c : %d\n", c + 'a', count[c]);
    return 0;
}

This code does half of the job, but not all.

Its output is in alphabetical order. How can I change it to give me an output of just the char array that is input?

As Ry- suggested in this comment, you could iterate back over the original string and use the chars as indices into your frequency table. Something like the following:

size_t len_string = strlen(string);

for (size_t i = 0; i < len_string; i++) {
  char ch = string[i];
  if (ch >= 'a' && ch <= 'z')   /* count[] only covers 'a'..'z' */
    printf("%c: %d, ", ch, count[ch-'a']);
}

This won't completely match your expected output, since this code will output l: 2 twice, but that raises the question:

What is your expected output when you have a string like abba? a:2, b:2? a:1, b:2, a:1? a:2, b:2, a:2? It's hard to help when you ask such an ambiguous question.

#include <stdio.h>
#include <string.h>

size_t ASCIIfreq[256];

void CountASCII(void *buff, size_t size)
{
    unsigned char *charsptr = buff;

    memset(ASCIIfreq, 0, sizeof(ASCIIfreq));
    while(size--)
    {
        ASCIIfreq[*charsptr++]++;
    }
}

void print(int printall)
{
    for(size_t index = 0; index < 256; index++)
    {
        if(ASCIIfreq[index] || printall)
        {
            printf("The %03zu (0x%02zx) ASCII - '%c' has occurred in the buffer %zu time%c\n",
                    index, index, (index > 32 && index < 127) ? (char)index : ' ',
                    ASCIIfreq[index], ASCIIfreq[index] == 1 ? ' ' : 's');
        }
    }
}

int main()
{
    char teststring[] = "i am trying to enter a word, and get how many times the letters were typed. Say my input is \"hello\" my output would be: h = 1, e = 1 l = 2 etc.I am very close to getting it right, but i have a small issue with this code";

    CountASCII(teststring, strlen(teststring));  /* strlen, not sizeof: don't count the terminating '\0' */
    print(0);

    return 0;
}

It's not clear what you mean by:

How can I change it to give me an output of just the char array that is input?

Because that's exactly what you're doing in any case: you pass a char array to the function, which updates the counts indexed alphabetically, and those counts are later output as is.

So I'm guessing that you want to output the counts in the same order that each char was first encountered?


Solution

This will require a bit more work. You could keep a second array tracking the order in which each character is encountered within find_frequency. But then that simple, clean function starts doing too much.
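
For comparison, here is a minimal sketch of what that second-array approach might look like. The names find_frequency_ordered, print_in_order and order[] are hypothetical (they are not in the question), and the sketch assumes the caller passes a zero-initialised count[26] and only cares about lowercase letters, as in the original code:

```c
#include <stdio.h>

/* Hypothetical variant of find_frequency: besides counting, it records
 * each lowercase letter in order[] the first time it is seen.
 * Assumes count[] is zero-initialised by the caller.
 * Returns the number of distinct letters stored in order[]. */
int find_frequency_ordered(const char s[], int count[26], char order[26])
{
    int distinct = 0;

    for (int i = 0; s[i] != '\0'; i++) {
        if (s[i] >= 'a' && s[i] <= 'z') {
            int idx = s[i] - 'a';
            if (count[idx] == 0)          /* first time we meet this letter */
                order[distinct++] = s[i];
            count[idx]++;
        }
    }
    return distinct;
}

/* Print the counts in first-seen order. */
void print_in_order(const int count[26], const char order[26], int distinct)
{
    for (int i = 0; i < distinct; i++)
        printf("%c : %d\n", order[i], count[order[i] - 'a']);
}
```

It works, but the counting function now has two responsibilities and an extra parameter, which is the trade-off mentioned above.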

So consider rather tweaking how you do the output:

void output_frequency(char s[], int count[]) {
    int c = 0;

    //loop s for the output
    while (s[c] != '\0') {
        if (s[c] >= 'a' && s[c] <= 'z' ) {
            //found a character, report the count only if not reported before
            if (count[s[c]-'a'] > 0) {
                printf("%c : %d\n", s[c], count[s[c] - 'a']);
                count[s[c]-'a'] = 0; //so you don't report this char again
            }
        }
        c++;
    }
}
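
Note that output_frequency zeroes entries of count as it reports them, so the counts are gone afterwards. If you need them again, one possible alternative is sketched below. The name output_frequency_keep and the local reported[] array are assumptions for illustration, not part of the original code:

```c
#include <stdio.h>

/* Hypothetical non-destructive variant of output_frequency: a local
 * reported[] array remembers which letters were already printed, so the
 * caller's count[] is left unchanged.
 * Returns the number of lines printed. */
int output_frequency_keep(const char s[], const int count[26])
{
    int reported[26] = {0};
    int printed = 0;

    for (int i = 0; s[i] != '\0'; i++) {
        if (s[i] >= 'a' && s[i] <= 'z') {
            int idx = s[i] - 'a';
            if (!reported[idx] && count[idx] > 0) {
                printf("%c : %d\n", s[i], count[idx]);
                reported[idx] = 1;    /* don't report this letter again */
                printed++;
            }
        }
    }
    return printed;
}
```

Because count is only read, the function can be called more than once on the same data.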

If you are attempting to get an in-order count instead of a count in alphabetical order, you simply need to coordinate the indexes of your count array with the order of characters in your input buffer. To do that, loop over all characters in your input buffer and, for each one, make a second pass counting the number of times that character occurs. This will give you an in-order count of the number of times each character occurs, e.g.

#include <stdio.h>
#include <string.h>

#define MAXC  1024

int main (void) {

    char buf[MAXC] = "";                /* buffer to hold input */
    int count[MAXC] = {0};              /* inorder count, one slot per input char */

    fputs ("enter string: ", stdout);   /* prompt for input */

    if (!fgets (buf, MAXC, stdin)) {    /* read line into buf & validate */
        fputs ("error: EOF, no valid input.\n", stderr);
        return 1;
    }

    /* loop over each character not '\n' */
    for (int i = 0; buf[i] && buf[i] != '\n'; i++) {
        char *p = buf;          /* pointer to buf */
        size_t off = 0;         /* offset from start of buf */
        while ((p = strchr (buf + off, buf[i]))) {  /* find char buf[i] */
            count[i]++;         /* increment corresponding index in count */
            off = p - buf + 1;  /* offset is one past current char */
        }
    }
    for (int i = 0; count[i]; i++)  /* output inorder character count */
        printf (i ? ",  %c: %d" : "%c: %d", buf[i], count[i]);
    putchar ('\n');     /* tidy up with new line */

    return 0;
}

(Note: strchr is used for convenience to find the next occurrence of the current character within the string; off (the offset) then restarts the search at the following character until no further matches are found. You can use an additional loop over the characters in the buffer instead if you like.)
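
A minimal sketch of that plain-loop alternative is shown here. The helper name count_char is hypothetical; it simply walks the string once and compares each character:

```c
#include <stddef.h>

/* Hypothetical helper: count how often ch occurs in the string s
 * by walking the string once, with no call to strchr. */
size_t count_char(const char *s, char ch)
{
    size_t n = 0;

    for (size_t i = 0; s[i] != '\0'; i++)
        if (s[i] == ch)
            n++;
    return n;
}
```

With such a helper, the body of the outer loop could store the result of count_char (buf, buf[i]) in count[i] instead of running the strchr loop. Like the strchr version, it rescans the buffer for every character, so it is no faster, only arguably easier to read.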

Example Use/Output

$ ./bin/charcnt_inorder
enter string: hello
h: 1,  e: 1,  l: 2,  l: 2,  o: 1

However, this recounts each character and reports the count again if the character is duplicated (e.g. l: 2, l: 2, once for each 'l'). Now it is unclear from:

"my output would be: h = 1, e = 1 l = 2 etc."

what you intended in that regard, but with just a little additional effort, you can use a separate index and a separate array to store the first instance of each character (in say a chars[] array) along with the count of each in your count[] array and preserve your inorder count while eliminating duplicate characters. The changes needed are shown below:

#include <stdio.h>
#include <string.h>

#define COUNT  128
#define MAXC  1024

int main (void) {

    char buf[MAXC] = "",
        chars[COUNT] = "";              /* array to hold inorder chars */
    int count[COUNT] = {0};
    size_t cdx = 0;                     /* add count index 'cdx' */
    fputs ("enter string: ", stdout);

    if (!fgets (buf, MAXC, stdin)) {
        fputs ("error: EOF, no valid input.\n", stderr);
        return 1;
    }

    for (int i = 0; buf[i] && buf[i] != '\n'; i++) {
        char *p = buf;
        size_t off = 0;
        chars[cdx] = buf[i];            /* store in chars array */
        if (i) {                        /* if past 1st char */
            int n = i;
            while (n--)                 /* simply check all before */
                if (buf[n] == buf[i])   /* if matches current */
                    goto next;          /* bail and get next char */
        }
        while ((p = strchr (buf + off, buf[i]))) {
            count[cdx]++;               /* increment count at index */
            off = p - buf + 1; 
        }
        cdx++;                          /* increment count index */
        next:;                          /* goto label to jump to */
    }
    for (size_t i = 0; i < cdx; i++)    /* cdx = number of distinct chars stored */
        printf (i ? ",  %c: %d" : "%c: %d", chars[i], count[i]);
    putchar ('\n');

    return 0;
}

Example Use/Output

$ ./bin/charcnt_inorder2
enter string: hello
h: 1,  e: 1,  l: 2,  o: 1

or

$ ./bin/charcnt_inorder2
enter string: amarillo
a: 2,  m: 1,  r: 1,  i: 1,  l: 2,  o: 1

Now your 'l' is only reported once with the correct count.

Note, in each example you should do additional validation to ensure the entire input fit within your buffer, etc. In the second example, the count and chars arrays were sized at 128 to cover the entire range of ASCII values. Don't skimp on buffer size. If you explicitly limit your input to UPPERcase or lowercase, then you can limit your count size to 26; otherwise you need to consider the additional characters and punctuation that will be encountered. The same applies to your input buffer. If you anticipate your max input would be 500 chars, double it (generally to the next available power of two; there is no real requirement for powers of two, but you are likely to see it that way).
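
One common way to do that validation is to check whether fgets stored the newline: if it did not, the line did not fit (or the input ended without a newline). The sketch below is a minimal illustration; the helper name trim_newline is an assumption, not something from the examples above:

```c
#include <string.h>

/* Hypothetical helper for a buffer filled by fgets: strips the trailing
 * '\n' if present and returns 1; returns 0 when no '\n' was stored,
 * which means the line was longer than the buffer (or the input ended
 * without a newline). */
int trim_newline(char *buf)
{
    size_t len = strcspn(buf, "\n");   /* index of '\n', or of '\0' if none */

    if (buf[len] == '\n') {
        buf[len] = '\0';
        return 1;
    }
    return 0;
}
```

Calling it right after a successful fgets lets you report an error or discard the rest of the line when it returns 0, rather than silently counting only part of the input.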

Bottom line, I'd rather be 10,000 characters too long than one character too short... leading to Undefined Behavior.

Lastly, as mentioned in my comment: never, never, never use gets. It is so insecure that it was removed from the C standard library in C11. Use fgets or POSIX getline instead.
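
As a rough sketch of what a drop-in replacement might look like in the original program, the hypothetical wrapper below (read_line is an illustrative name, not a library function) reads one line with fgets and drops the newline that fgets keeps:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for gets(): reads at most size - 1 characters
 * of one line from fp into buf and removes the trailing '\n', if any.
 * Returns buf on success, NULL on EOF or read error. */
char *read_line(FILE *fp, char *buf, int size)
{
    if (size < 1 || fgets(buf, size, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';    /* drop the newline fgets keeps */
    return buf;
}
```

In the question's main, the call gets(string) could then become read_line(stdin, string, (int)sizeof string), with the return value checked before counting.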

Look things over and let me know if you have further questions.


 