
Eliminating all extra spaces in a string

The function has to eliminate all extra spaces between words and punctuation marks, and a punctuation mark must not have any space between it and the word before it.

For example, I have this string:

Hey   ,how   are you today    ?

and I should get this:

Hey, how are you today?

This function eliminates the extra spaces between words, but I don't know how to handle the punctuation marks.

By the way, I am calling this function from the main function.

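#include <stdio.h>
#include <string.h>

// str is a caller-supplied buffer of `size` bytes
void space_rule(char *str, size_t size){
    printf("Enter a string: ");
    // fgets replaces gets, which cannot limit the input length
    if (fgets(str, (int)size, stdin) == NULL)
        return;
    str[strcspn(str, "\n")] = '\0';  // drop the trailing newline
    puts(str);

    // Print the words separated by single spaces
    char *p = strtok(str, " ");
    while (p != NULL) {  // compare the pointer, not the character it points to
        printf("%s ", p);
        p = strtok(NULL, " ");
    }
}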

Consider:

#include <stdio.h>
#include <ctype.h>

int main()
{
    const char* input = "Hey   ,how   are you today    ?" ;
    const char* chp = input ;
    
    // While not end of input...
    while( *chp != '\0' )
    {
        
        // Print all regular characters (not space or punctuation)
        if( !(isspace( *chp ) || ispunct( *chp )) )
        {
            putchar( *chp ) ;
            chp++ ;
        }
        // Print all punctuation followed by a space (except at end)
        else if( ispunct( *chp ) )
        {
            putchar( *chp ) ;
            chp++ ;
            if( *chp != '\0' && !isspace( *chp ) )
            {
                putchar( ' ' ) ;
            }
        }
        // Is space...
        else if( isspace( *chp ) )
        {
            // Skip all space
            while( *chp != '\0' && isspace( *chp ) )
            {
                chp++ ;
            }
            
            // If not end, and not punctuation...
            if( *chp != '\0' && !ispunct( *chp ) )
            {
                // ...insert single space 
                putchar( ' ' ) ;
            }
        }
    }
    
    return 0;
}

However, you will likely need to refine your rules more carefully. Perhaps not all punctuation should be treated the same? For example, for:

const char* input = "She said \"Hello\"" ;

The output is:

She said" Hello"  

which is unlikely to be intended. For that not only would you need to include an exception rule for '"' , you'd need to account for opening and closing quotes and apply the rule accordingly. An exercise for the reader - I suggest you post a new question if you remain stuck with that.
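As a possible starting point for that exercise, here is a minimal single-pass sketch that treats paired double quotes specially. It is not part of the answer's code: the function name `fix_spacing_quoted`, the `prev_kind` enum, and the rule that every other `"` is an opening quote are all assumptions made for illustration. It still treats apostrophes and single quotes as ordinary punctuation, so a word like don't would come out wrong.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

// What the most recently emitted character was.
enum prev_kind { AT_START, AFTER_WORD, AFTER_PUNCT, AFTER_OPEN_QUOTE };

// Hypothetical helper: writes the fixed string to out.
// Each input character emits at most one space plus itself,
// so out must hold at least 2 * strlen(in) + 1 bytes.
void fix_spacing_quoted(char *out, size_t cap, const char *in)
{
    enum prev_kind prev = AT_START;
    int pending_space = 0;  // whitespace seen since the last emitted character
    int in_quote = 0;       // assumption: double quotes strictly alternate open/close

    assert(cap >= 2 * strlen(in) + 1);

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;

        if (isspace(c)) {
            pending_space = 1;
            continue;
        }
        if (c == '"' && !in_quote) {
            // Opening quote: one space before it (unless first), none after.
            if (prev != AT_START && prev != AFTER_OPEN_QUOTE)
                *out++ = ' ';
            prev = AFTER_OPEN_QUOTE;
            in_quote = 1;
        } else if (ispunct(c)) {
            // Closing quote or other punctuation: never a space before it.
            if (c == '"')
                in_quote = 0;
            prev = AFTER_PUNCT;
        } else {
            // Word character: one space after punctuation or between words.
            if (prev == AFTER_PUNCT || (prev == AFTER_WORD && pending_space))
                *out++ = ' ';
            prev = AFTER_WORD;
        }
        *out++ = (char)c;
        pending_space = 0;
    }
    *out = '\0';
}
```

Called with a sufficiently large buffer, this leaves `She said "Hello"` unchanged and still turns the question's example into `Hey, how are you today?`.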

It gets really complicated if you have a string such as:

She said, "Hey, how are, you today?"

Because then you have ? and " together. For ? you want a rule such as "space after punctuation unless the next character is also punctuation". But if the ? and " are separated by a space, you have to eliminate that space before making the decision. To be honest, I gave up trying to figure out all the permutations. If you want to do that, I would suggest performing the transform in multiple passes, applying one rule at a time, for example:

  1. Reduce all multiple spaces to 1 space
  2. Insert space after punctuation if no space is present, and it is not an opening quote ( ' or " ).
  3. Remove space between punctuation
  4. Insert space before opening quote if no space is present.

By having simpler rules (executed in separate functions) and multiple passes, it will be much easier to get right, at the expense of efficiency. You can then also easily re-order the application of the rules to get the intended result. For example, here rule 4 must be applied after rule 3.
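The multi-pass structure could be sketched roughly as follows. This is only an illustration under stated assumptions: all function names are hypothetical, only rule 1 and a simplified rule 2 (no quote exception) are implemented, and an extra pass that drops a space before punctuation covers the original question's requirement. Rules 3 and 4 are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

// Pass A (rule 1): collapse each run of whitespace to one space, in place.
static void collapse_spaces(char *s)
{
    char *out = s;
    int prev_space = 0;
    for (; *s != '\0'; s++) {
        int sp = isspace((unsigned char)*s) != 0;
        if (!sp)
            *out++ = *s;
        else if (!prev_space)
            *out++ = ' ';
        prev_space = sp;
    }
    *out = '\0';
}

// Pass B: drop a space that is directly followed by punctuation, in place.
static void drop_space_before_punct(char *s)
{
    char *out = s;
    for (; *s != '\0'; s++) {
        if (*s == ' ' && ispunct((unsigned char)s[1]))
            continue;
        *out++ = *s;
    }
    *out = '\0';
}

// Pass C (simplified rule 2): insert a space after punctuation that is
// directly followed by a letter or digit. The output can grow, so it goes
// to a separate buffer. Returns 0 on success, -1 if dst is too small.
static int space_after_punct(char *dst, size_t cap, const char *src)
{
    size_t n = 0;
    for (; *src != '\0'; src++) {
        int add = ispunct((unsigned char)src[0]) && isalnum((unsigned char)src[1]);
        if (n + 1 + (size_t)add >= cap)  // keep room for the terminator
            return -1;
        dst[n++] = *src;
        if (add)
            dst[n++] = ' ';
    }
    if (n >= cap)
        return -1;
    dst[n] = '\0';
    return 0;
}

// Runs the passes in order. Note that src is modified by passes A and B.
int fix_spacing_multipass(char *dst, size_t cap, char *src)
{
    assert(dst != NULL && src != NULL);
    collapse_spaces(src);
    drop_space_before_punct(src);
    return space_after_punct(dst, cap, src);
}
```

Because each pass is its own function, swapping the order or adding a quote-handling pass only changes the body of `fix_spacing_multipass`.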

The changes you want can all be expressed as changes based on the types of each pair of adjacent characters. You need to characterize each character as a space, a letter, or punctuation, and deal with pairs as follows:

  • <space><space>: remove one (the first?) space
  • <space><punct>: remove the space
  • <punct><letter>: insert a space between them
  • otherwise: leave it alone.

Since these pair rules never do anything to the second character in the pair, this suggests a simple loop copying characters from an input buffer to an output buffer, doing those changes as you go:

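#include <ctype.h>

// Assumption: a "word character" is a letter or digit; adjust as needed.
static int iswordchar(int c) { return isalnum(c); }

void copy_and_fix(char *out, const char *in) {
    while (*in) {
        // The <ctype.h> functions expect values in the unsigned char range.
        unsigned char cur = (unsigned char)in[0];
        unsigned char next = (unsigned char)in[1];  // '\0' at the end matches no rule
        if (isspace(cur) && isspace(next)) {
            // don't copy the first space
        } else if (isspace(cur) && ispunct(next)) {
            // don't copy the space
        } else if (ispunct(cur) && iswordchar(next)) {
            // keep the punct and insert a space;
            *out++ = *in;
            *out++ = ' ';
        } else {
            // just copy the character
            *out++ = *in;
        }
        ++in;
    }
    *out = '\0';  // NUL terminate the output.
}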

Of course, you need to be careful about the sizes of your buffers to make sure you don't overflow the output buffer, but that can be dealt with in a variety of ways (ensure the output buffer is at least 1.5 times the size of the input, or pass a size and truncate the output if necessary, or allocate the output buffer on the heap and resize as needed).
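As one illustration of the heap option, the sketch below allocates the worst-case size up front instead of resizing. The name `copy_and_fix_alloc` is hypothetical, and `isalnum` is used as an assumed definition of "word character". The 1.5 factor comes from the fact that a space is only inserted after a punctuation character followed by a word character, so at most one insertion can happen per two input characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical variant of the pair-rule copy: returns a newly allocated,
// fixed copy of `in`, or NULL if allocation fails. The caller frees it.
char *copy_and_fix_alloc(const char *in)
{
    assert(in != NULL);

    size_t len = strlen(in);
    // At most len / 2 inserted spaces, plus one byte for the terminator.
    char *buf = malloc(len + len / 2 + 1);
    if (buf == NULL)
        return NULL;

    char *out = buf;
    for (; *in != '\0'; in++) {
        unsigned char cur = (unsigned char)in[0];
        unsigned char next = (unsigned char)in[1];

        // <space><space> and <space><punct>: drop the first character.
        if (isspace(cur) && (isspace(next) || ispunct(next)))
            continue;

        *out++ = (char)cur;

        // <punct><word char>: insert a space between them.
        if (ispunct(cur) && isalnum(next))
            *out++ = ' ';
    }
    *out = '\0';
    return buf;
}
```

Sizing for the worst case wastes some memory on typical input but avoids any reallocation logic in the copy loop.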
