
How to remove punctuation from a String in C

I want to remove all punctuation from a string and convert all uppercase letters to lowercase in C. Any suggestions?

Loop over the characters of the string. Whenever you meet a punctuation character (ispunct), don't copy it to the output string. Whenever you meet an alphabetic character (isalpha), use tolower to convert it to lowercase.

All the mentioned functions are defined in <ctype.h>
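As a small illustration of how these <ctype.h> functions behave, here is a minimal sketch. The helper names is_dropped and lowered are hypothetical, chosen only for this example, and the results assume the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if c is punctuation (and so would be
   dropped from the output), 0 otherwise. */
int is_dropped(char c)
{
    /* Cast to unsigned char first: passing a negative char value to the
       <ctype.h> functions is undefined behavior. */
    return ispunct((unsigned char)c) != 0;
}

/* Hypothetical helper: returns the lowercase form of c. Characters that
   are not uppercase letters come back unchanged. */
char lowered(char c)
{
    return (char)tolower((unsigned char)c);
}
```

Note that tolower can be applied to every kept character without checking isalpha or isupper first, because it leaves other characters as they are.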

You can either do it in-place (by keeping separate write pointers and read pointers to the string), or create a new string from it. But this entirely depends on your application.
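For the "create a new string" option, one possible sketch is shown here. The function name strip_punct_copy is hypothetical, and the choice to allocate with malloc and have the caller free the result is an assumption, not something the question requires.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of s with all
   punctuation removed and all letters lowercased. The caller must free()
   the result. Returns NULL if the allocation fails. */
char *strip_punct_copy(const char *s)
{
    /* The result can never be longer than the input. */
    char *out = malloc(strlen(s) + 1);
    char *dst = out;

    if (out == NULL)
        return NULL;

    for (; *s; ++s) {
        unsigned char c = (unsigned char)*s;
        if (!ispunct(c))
            *dst++ = (char)tolower(c);
    }
    *dst = '\0';
    return out;
}
```

This version leaves the input untouched, which also makes it usable on string literals, whereas an in-place version needs a writable buffer.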

Just a sketch of an algorithm using functions provided by ctype.h :

#include <ctype.h>

void remove_punct_and_make_lower_case(char *p)
{
    char *src = p, *dst = p;

    while (*src)
    {
       if (ispunct((unsigned char)*src))
       {
          /* Skip this character */
          src++;
       }
       else if (isupper((unsigned char)*src))
       {
          /* Make it lowercase */
          *dst++ = tolower((unsigned char)*src);
          src++;
       }
       else if (src == dst)
       {
          /* Increment both pointers without copying */
          src++;
          dst++;
       }
       else
       {
          /* Copy character */
          *dst++ = *src++;
       }
    }

    *dst = 0;
}

Standard caveats apply: Completely untested; refinements and optimizations left as exercise to the reader.

The idiomatic way to do this in C is to have two pointers, a source and a destination, and to process each character individually, e.g.:

#include <ctype.h>

void reformat_string(char *src, char *dst) {
    for (; *src; ++src)
        if (!ispunct((unsigned char) *src))
            *dst++ = tolower((unsigned char) *src);
    *dst = 0;
}

src and dst can be the same string since the destination will never be larger than the source.
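To make the aliasing point concrete, here is a short sketch. reformat_string is repeated from the answer so the block compiles on its own, and reformat_in_place is a hypothetical wrapper added only for illustration.

```c
#include <ctype.h>

/* Same function as in the answer, repeated so this sketch is self-contained. */
void reformat_string(char *src, char *dst) {
    for (; *src; ++src)
        if (!ispunct((unsigned char) *src))
            *dst++ = tolower((unsigned char) *src);
    *dst = 0;
}

/* Hypothetical wrapper: in-place use, passing the same buffer as both
   source and destination. This is safe because dst never gets ahead of
   src, so every character is read before its slot can be overwritten. */
void reformat_in_place(char *s) {
    reformat_string(s, s);
}
```

When the buffers are distinct, the destination needs room for at least strlen(src) + 1 bytes, which covers the case where the source contains no punctuation at all.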

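Although it's tempting, avoid calling tolower(*src++), since tolower may be implemented as a macro. Standard C requires such a macro to evaluate its argument exactly once, but some older libraries did not, in which case src would be incremented more than once.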

Avoid solutions that search for characters to replace (using strchr or similar) and then shift the rest of the string each time; they turn a linear algorithm into a quadratic one.

Here's a rough cut of an answer for you:

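#include <ctype.h>
#include <string.h>

void strip_punct(char *str) {
    size_t i;
    size_t p = 0;                 /* write index */
    size_t len = strlen(str);
    for (i = 0; i < len; i++) {
        /* Cast to unsigned char: negative values are undefined for <ctype.h> */
        if (!ispunct((unsigned char) str[i])) {
            str[p] = (char) tolower((unsigned char) str[i]);
            p++;
        }
    }
    str[p] = '\0';                /* terminate the shortened string */
}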

