Remove punctuation at beginning and end of a string

I have a string and I want to remove all the punctuation from the beginning and the end of it only, but not the middle.

I have written code that removes punctuation from the first and last character of a string only, which is clearly inefficient, and useless if the string has two or more punctuation characters at the end.

Here is an example:

{ Hello ""I am:: a Str-ing!! }

Desired output

{ Hello I am a Str-ing }

Are there any functions that I could use? Thanks.

This is what I've done so far. I'm actually editing the string in a linked-list

if(ispunct(removeend->string[(strlen(removeend->string))-1]) != 0) { 
    removeend->string[(strlen(removeend->string))-1] = '\0'; 
} 
else {} 
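
The snippet above removes at most one trailing character. A minimal sketch of the same idea with the if turned into a loop, so that any number of trailing punctuation characters is removed. trim_tail_punct is a hypothetical helper name, not something from the original code; it would be called on removeend->string.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: strips every trailing punctuation character in place. */
void trim_tail_punct(char *s)
{
    size_t len = strlen(s);

    /* Cast to unsigned char: passing a negative char to ispunct() is undefined. */
    while (len > 0 && ispunct((unsigned char)s[len - 1])) {
        s[--len] = '\0';
    }
}
```

Tracking the length in a variable also avoids calling strlen() again on every pass.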

Iterate over the string and check each character with isalpha(); starting at the first character that passes, write the characters into a new string.

Iterate over the new string backwards, replacing each punctuation character with \0 until you find a character that isn't punctuation.
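
The two passes described above could be sketched roughly as follows. copy_trimmed is a hypothetical name, and the sketch tests with ispunct() instead of the isalpha() check mentioned in the answer, so that digits at the ends of the string are kept; that swap is an assumption about the intent.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst without leading or trailing
 * punctuation. dst must have room for at least strlen(src) + 1 bytes. */
void copy_trimmed(char *dst, const char *src)
{
    size_t len;

    /* Pass 1: skip leading punctuation, then copy the rest. */
    while (ispunct((unsigned char)*src)) {
        src++;
    }
    strcpy(dst, src);

    /* Pass 2: walk backwards, overwriting trailing punctuation with '\0'. */
    len = strlen(dst);
    while (len > 0 && ispunct((unsigned char)dst[len - 1])) {
        dst[--len] = '\0';
    }
}
```

Punctuation in the middle of the string is never examined, so a word such as Str-ing keeps its hyphen.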

In a while loop, call strtok repeatedly to split the string into words at each space character. You could also use sscanf instead of strtok.

Then, for each word, run a for loop from the end of the string towards the beginning. While !isalpha(current character) holds, put a \0 at the current position; stop at the first character that passes isalpha. This removes the trailing punctuation.

Now run another for loop over the same word, this time from 0 to strlen(currentstring). While !isalpha(current character) holds, continue. At the first character that passes isalpha, copy it and all the remaining characters into a buffer. The buffer is the cleaned string; copy it back into the original string.

Repeat the two steps above for each of strtok's remaining outputs.
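
A minimal sketch of the strtok approach, under a few assumptions: words are separated by spaces only, the cleaned words are written to a separate output buffer joined by single spaces, and ispunct() is used in place of !isalpha() so that digits survive. trim_words is a hypothetical name.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: trims punctuation from both ends of every
 * space-separated word of `in` and writes the words, joined by single
 * spaces, to `out`. `in` is modified by strtok(); `out` needs at least
 * strlen(in) + 1 bytes and must not overlap `in`. */
void trim_words(char *in, char *out)
{
    char *word;
    size_t n = 0;   /* number of bytes written to out so far */

    for (word = strtok(in, " "); word != NULL; word = strtok(NULL, " ")) {
        size_t len = strlen(word);

        /* Tail: drop trailing punctuation. */
        while (len > 0 && ispunct((unsigned char)word[len - 1])) {
            len--;
        }
        /* Head: skip leading punctuation. */
        while (len > 0 && ispunct((unsigned char)*word)) {
            word++;
            len--;
        }
        if (len == 0) {
            continue;   /* the word was punctuation only */
        }
        if (n > 0) {
            out[n++] = ' ';
        }
        memcpy(out + n, word, len);
        n += len;
    }
    out[n] = '\0';
}
```

Because strtok() writes \0 bytes into its input, the function has to be given a modifiable array, not a string literal.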

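#include <ctype.h>
#include <string.h>

char *rm_punct(char *str) {
  char *p = str;                /* head: first character to keep */
  char *t = str + strlen(str);  /* tail: one past the last character to keep */
  /* the unsigned char casts avoid undefined behaviour for negative char values */
  while (ispunct((unsigned char)*p)) p++;
  while (t > p && ispunct((unsigned char)t[-1])) *--t = '\0';
  /* also if you want to preserve the original address */
  memmove(str, p, (size_t)(t - p) + 1); /* +1 copies the terminating '\0' */
  p = str; /* --- */

  return p;
}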
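
#include <stdio.h>
#include <ctype.h>
#include <string.h>

char* trim_ispunct(char* str){
    size_t len;
    char* p;

    if(str == NULL || *str == '\0') return str;
    /* strip trailing punctuation; len > 0 keeps the index inside the array */
    for(len=strlen(str); len > 0 && ispunct((unsigned char)str[len-1]);--len)
        str[len-1]='\0';
    /* skip leading punctuation */
    for(p=str;ispunct((unsigned char)*p);++p);

    /* source and destination overlap, so use memmove() rather than strcpy() */
    return memmove(str, p, strlen(p)+1);
}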

int main(){
    //test
    char str[][16] = { "Hello", "\"\"I", "am::", "a", "Str-ing!!" };
    int i, size = sizeof(str)/sizeof(str[0]);
    for(i = 0;i<size;++i)
        printf("%s\n", trim_ispunct(str[i]));

    return 0;
}
/* result:
Hello
I
am
a
Str-ing
*/

Construct a tiny state machine. The cha2class() function divides the characters into equivalence classes. The state machine always skips punctuation, except when it has letters on both its left and its right; in that case the punctuation is preserved (that is the memmove() in state 3). Note that cha2class() counts only ASCII letters as word characters, so digits are treated like punctuation.

#include <stdio.h>
#include <string.h>

#define IS_ALPHA 1
#define IS_WHITE 2
#define IS_PUNCT 3
int cha2class(int ch);
void scrutinize(char *str);

int cha2class(int ch)
{
    if (ch >= 'a' && ch <= 'z') return IS_ALPHA;
    if (ch >= 'A' && ch <= 'Z') return IS_ALPHA;
    if (ch == ' ' || ch == '\t') return IS_WHITE;
    if (ch == EOF || ch == 0) return IS_WHITE;
    return IS_PUNCT;
}

void scrutinize(char *str)
{
    size_t pos, dst, start;
    int typ, state;

    state = 0;
    for (dst = pos = start = 0; ; pos++) {
        typ = cha2class(str[pos]);
        switch (state) {
        case 0: /* BOF, white seen */
            if (typ == IS_WHITE) break;
            else if (typ == IS_ALPHA) { start = pos; state = 1; }
            else if (typ == IS_PUNCT) { start = pos; state = 2; continue; }
            break;
        case 1: /* inside a word */
            if (typ == IS_ALPHA) break;
            else if (typ == IS_WHITE) { state = 0; }
            else if (typ == IS_PUNCT) { start = pos; state = 3; continue; }
            break;
        case 2: /* inside punctuation after whitespace: skip it */
            if (typ == IS_PUNCT) continue;
            else if (typ == IS_WHITE) { state = 0; }
            else if (typ == IS_ALPHA) { state = 1; }
            break;
        case 3: /* inside punctuation after a word */
            if (typ == IS_PUNCT) continue;
            else if (typ == IS_WHITE) { state = 0; }
            else if (typ == IS_ALPHA) {
                /* punctuation between two letters: copy it back in */
                memmove(str + dst, str + start, pos - start);
                dst += pos - start;
                state = 1;
            }
            break;
        }
        str[dst++] = str[pos];
        if (str[pos] == '\0') break;
    }
}

int main (int argc, char **argv)
{
    char test[] = ".This! is... ???a.string?";

    scrutinize(test);

    printf("Result=%s\n", test);

    return 0;
}

OUTPUT:

Result=This is a.string


 