Trim Leading and Trailing whitespace from string in C

I've been searching Stack Overflow for help with my function void strip(char *s) and haven't found an answer that fits my requirements. Every implementation I have tried has resulted in a segfault.

My idea is to increment the char pointer s until all leading whitespace is eaten. Then I have a pointer, end, to the end of the string, and I decrement it until all trailing whitespace is eaten as well. I have run my code in gdb to make sure everything points correctly, which it does. However, when I try to cut off the whitespace by writing a null character to mark the end of the new substring, I can't get it to work.

SEE EDIT:

void strip(char *s) {
  char *end;

  int length = strlen(s);
  end = s + length - 1;

  while (1) {
    if (*s == ' ' || *s == '\t' || *s == '\n') {
      s++;
    } else {
      break;
    }
  }

  while (1) {
    if (*end == ' ' || *end == '\t' || *end == '\n') {
      // *end = '\0'; my current implementation has this uncommented
      end--;
    } else {
      break;
    }
  }
}

int main(void) {
  char *string = "\tHello, World\t";
  strip(string);

  printf("%s\n", string);
}

From my understanding, at the very least the main function should print Hello, World\t, with the trailing tab still in place. However, the console outputs \tHello, World\t, as if the pointer was never moved.

Here's implementation number 2. This one uses two pointers, one for the start and one for the end, and then creates a whole new string for s to point to. This one also prints just \tHello, World\t. GDB, on the other hand, shows temp as Hello, World.

void strip(char *s) {
  char *start, *end;

  int length = strlen(s);
  end = s + length - 1;

  while (1) {
    if (*start == ' ' || *start == '\t' || *start == '\n') {
      start++;
    } else {
      break;
    }
  }

  while (1) {
    if (*end == ' ' || *end == '\t' || *end == '\n') {
      end--;
    } else {
      break;
    }
  }

  char temp[length];

  int index = 0;

  while (start <= end) {
    temp[index++] = *start;
    start++;
  }

  temp[index] = '\0';

  s = temp;
}

EDIT

void strip(char *s)
{
    char *end;

    int length = strlen(s);
    end = s + length - 1;

    // while a character is a space AND didn't reach the end
    while (s != NULL && isspace(*s))
    {
        s++;
    }

    // While end has not passed ptr s AND is a space
    while (end > s && isspace(*end))
    {
        // replace the whitespace
        *end = '\0';
        end--;
    }
}


int linesff(const char *s, char **lines)
{
    FILE *fp;

    if ((fp = fopen(s, "r")) == NULL)
    {
        printf("Cannot open file: %s\n", s);
        return -1;
    }

    // max buffer size
    char buf[MAX_C];
    int index = 0;

    // while Not EOF
    while (!feof(fp))
    {
        // get a line to put in buf
        fgets(buf, MAX_C, fp);

        // strip buf
        strip(buf);

        // if buf is greater than 1, meaning line has characters left
        if (strlen(buf) > 1)
        {
            // add buf at index
            lines[index] = buf;
            index++;
        }
    }

    // return the count
    return index;
}

strip is a helper function for linesff that strips whitespace. The **lines array is created like this:

for (int i = 0; i < MAX_L; i++)
    lines[i] = malloc(MAX_C);

I am still unable to modify buf to hold the correctly stripped string.

There are some issues. In your second implementation, start is never initialized; it has to be set to s before the first loop. temp is deallocated as soon as the program leaves the strip function, so s would point to invalid memory. But s is also local to this function, so assigning to it at the end has no effect on the caller. What you need to do is allocate memory on the heap, for example with malloc, copy the contents of temp into it, and return the pointer to that memory. Memory allocated on the heap lives until you release it by calling free(...) on it.
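The point about s being local can be seen in a minimal sketch. The names set_local and skip_leading are hypothetical and do not appear in the question: the first only moves its own copy of the pointer, while the second returns the adjusted pointer so the caller can use it.

```c
/* Hypothetical helper: the parameter is a copy of the caller's pointer,
   so advancing it here is invisible to the caller. */
void set_local(const char *s) {
  while (*s == ' ' || *s == '\t' || *s == '\n') {
    s++;  /* moves the local copy only */
  }
}

/* Hypothetical helper: returning the adjusted pointer hands the new
   position back to the caller. */
const char *skip_leading(const char *s) {
  while (*s == ' ' || *s == '\t' || *s == '\n') {
    s++;
  }
  return s;
}
```

After calling set_local(text), text still points at the leading tab; skip_leading(text) returns a pointer to the first non-whitespace character instead.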

This way it will work:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char* strip(char *s) {
  char *start, *end;

  int length = strlen(s);
  start = s;
  end = s + length - 1;

  while (1) {
    if (*start == ' ' || *start == '\t' || *start == '\n') {
      start++;
    } else {
      break;
    }
  }

  // stop at start so an all-whitespace string never reads before the buffer
  while (end > start && (*end == ' ' || *end == '\t' || *end == '\n')) {
    end--;
  }

  char temp[length + 1];
  int index = 0;

  while (start <= end) {
    temp[index++] = *start;
    start++;
  }

  temp[index] = '\0';

  char *result = malloc(strlen(temp) + 1);
  if (result == NULL)  // check if malloc failed
    return NULL;

  strcpy(result, temp);
  return result;
}

int main(void) {
  char *string = "\tHello, World\t";
  char *result_string = strip(string);

  // check the return value; if it is NULL, something went wrong
  if (result_string != NULL) {
    printf("%s\n", result_string);
    free(result_string);
  } else {
    printf("Error occurred\n");
  }
  return 0;
}
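The intermediate temp array is not strictly required. As a rough sketch of the same idea, the trimmed range can be copied straight into the malloc'd buffer. The name strip_copy is hypothetical, and here end is assumed to point one past the last kept character rather than at it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: returns a newly allocated, trimmed copy of s,
   or NULL if malloc fails. The caller must free() the result. */
char *strip_copy(const char *s) {
  const char *start = s;
  const char *end = s + strlen(s);  /* one past the last character */

  /* skip leading whitespace; the terminating '\0' stops the loop */
  while (*start == ' ' || *start == '\t' || *start == '\n') {
    start++;
  }

  /* step back over trailing whitespace, never moving before start */
  while (end > start &&
         (end[-1] == ' ' || end[-1] == '\t' || end[-1] == '\n')) {
    end--;
  }

  size_t n = (size_t)(end - start);  /* length of the trimmed text */
  char *result = malloc(n + 1);
  if (result == NULL) {
    return NULL;
  }

  memcpy(result, start, n);
  result[n] = '\0';
  return result;
}
```

Because the input is only read, this version can be called on a string literal; an empty or all-whitespace input yields an empty string.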
char *string = "\tHello, World\t";

If you do this, string is a pointer to a string literal, which means you must not change the string's contents through the pointer. Attempting to write to it is undefined behavior and typically shows up as a segfault.

To be able to change the string, you must store it in writable memory, on the stack or on the heap, for example:

char string[100] = "\tHello, World\t";
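With writable storage like this, the original void strip(char *s) idea can work in place. The following is a sketch only: strip_in_place is a hypothetical name, and it uses isspace, which also treats '\r', '\v' and '\f' as whitespace, so it trims slightly more than the three characters checked in the question.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical in-place variant; s must point to writable storage
   (an array or malloc'd memory), not to a string literal. */
void strip_in_place(char *s) {
  char *start = s;
  char *end = s + strlen(s);  /* one past the last character */

  /* cast to unsigned char: isspace() is undefined for negative values */
  while (isspace((unsigned char)*start)) {
    start++;
  }
  while (end > start && isspace((unsigned char)end[-1])) {
    end--;
  }

  size_t n = (size_t)(end - start);
  memmove(s, start, n);  /* source and destination may overlap */
  s[n] = '\0';
}
```

Shifting the kept characters to the front of the buffer means the caller's pointer stays valid, so nothing has to be returned.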
