
Replacing characters in string with C

I am reading a book and it defines a function to replace characters in a char array like this:

void RemoveChars(char remove[], char str[])
{
   int src, dst, removeArray[256];
   for (src=0; src < 256; src++) {
     removeArray[src] = 0;
   }

   src = 0;
   while (remove[src]) {
     removeArray[remove[src]] = 1;
     src++;
   }

   src = dst = 0;
   do {
     if (!removeArray[remove[src]]) {
       str[dst++] = str[src];
     }
   } while (str[src++]);
}

My question is this: imagine that remove[] contains "b" and str[] contains "hi", so:

str[0] = 'h' and str[1] = 'i'.

From what I see in the code, we would do:

str[1] = str[0] --> str[1] = 'h'

But that means we just overwrote the 'i', so we wouldn't be able to find it in the next iteration, right?

What am I missing here?

There are a few glaring holes in that code as it stands. The first is the use of the naked char datatype which may be signed or unsigned. If it's signed then negative values are likely to cause serious problems when used as an array index.
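To make that first point concrete, here is a minimal sketch of the usual fix. The helper name tableIndex is made up for illustration; the only idea it shows is converting to unsigned char before using the value as an index:

```c
#include <limits.h>
#include <stddef.h>

// Hypothetical helper: map any char to a valid index in 0..UCHAR_MAX.
// Converting to unsigned char first means a negative char value
// (possible where plain char is signed) can never become a negative index.

static size_t tableIndex (char c) {
    return (unsigned char) c;
}
```

With this, tableIndex ((char) -1) gives UCHAR_MAX rather than -1, whichever way plain char happens to be signed on your platform.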

The second problem is with the detection of whether a character is to be removed. You use !removeArray[remove[src]] to try to work out whether a character in the source string should be removed. But it's not the remove array you should be indexing with src there, it's the str array (the source string).

Lastly, you're assuming that the char type is eight bits wide and hence has 256 distinct values. That might be okay if you know it's the case but, for truly portable code, you would use UCHAR_MAX from limits.h.
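As a rough sketch of what sizing the table portably might look like (MAP_SIZE and markChars are names invented here, not anything from the book):

```c
#include <limits.h>
#include <string.h>

// One flag per possible unsigned char value, however wide char is.

#define MAP_SIZE (UCHAR_MAX + 1)

// Hypothetical helper: clear the map, then set map[c] to 1 for
// every character c that appears in chars.

static void markChars (unsigned char map[MAP_SIZE], const char *chars) {
    memset (map, 0, MAP_SIZE);
    for (; *chars != '\0'; chars++)
        map[(unsigned char) *chars] = 1;
}
```

On a platform with eight-bit chars MAP_SIZE works out to 256, the same as the book's hard-coded value, but nothing in the code depends on that.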

So a better starting point (with comments) would be:

void removeChars (unsigned char *remove, unsigned char *str) {
    size_t src, dst;
    unsigned char removeMap [UCHAR_MAX + 1];

    // Initial map is to preserve everything.

    memset (removeMap, 0, sizeof (removeMap));

    // For each character to be removed, change its map entry.

    while (*remove != '\0') {
        removeMap [*remove] = 1;
        remove++;
    }

    // Run two pointers through the array, source and destination.

    src = dst = 0;
    do {
        // Only if character allowed to survive will it be transferred.

        if (! removeMap [str [src]]) {
            str [dst++] = str [src];
        }

        // Finish once the string terminator has been transferred.

    } while (str [src++]);
}

Combining that with some test code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>

void removeChars (unsigned char *, unsigned char *);

char *mystrdup (const char *s) {
    char *news = malloc (strlen (s) + 1);
    if (news != NULL)
        strcpy (news, s);
    return news;
}

int main (int argc, char *argv[]) {
    if (argc != 3) {
        printf ("Usage: testprog <string> <characters-to-remove>\n");
        return 1;
    }

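    char *string = mystrdup (argv[1]);
    char *remove = mystrdup (argv[2]);
    if (string == NULL || remove == NULL) {
        free (string);
        free (remove);
        return 1;
    }

    // removeChars takes unsigned char pointers, so convert explicitly.

    removeChars ((unsigned char *) remove, (unsigned char *) string);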

    printf ("Result is '%s'\n", string);

    free (string);
    free (remove);

    return 0;
}

and running it with:

testprog 'Pax is a really nice guy' Piul

gives you the expected output:

Result is 'ax s a reay nce gy'
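As for the worry in the question about overwriting the 'i': dst++ is a post-increment, so with "hi" and nothing to remove, the first copy is str[0] = str[0], not str[1] = str[0]. More generally, dst can never get ahead of src, so every character is read before anything is written over it. The sketch below is a cut-down, single-character variant (removeOne is a name made up for this illustration, not the book's function) with an assert that checks that invariant on every pass:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Illustrative variant: remove every occurrence of the single character c
// from str, in place, and return the new string length.

static size_t removeOne (char *str, char c) {
    size_t src = 0, dst = 0;
    do {
        // The write position never overtakes the read position.

        assert (dst <= src);

        // Always copy the terminator; otherwise copy only survivors.

        if (str[src] == '\0' || str[src] != c) {
            str[dst++] = str[src];
        }
    } while (str[src++] != '\0');

    // dst also counted the terminating '\0'.

    return dst - 1;
}
```

Calling removeOne on a buffer holding "hi" with 'b' leaves the buffer as "hi", and calling it on "banana" with 'a' leaves "bnn".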
