Removing a space, newline, and tabs with string trimming in C

I was trying to gather sources from the internet to understand how string trimming works. Basically, I need to check for spaces, newlines, and tabs every time a string is read. So I made a function that handles that case:

#include <stdlib.h>
#include <stdio.h>

static int  isspace(char c)
{
    return (c == '\t' || c == '\n' || c == ' ');
}

Then I use that helper inside another function:

char *my_strtrim(char const *string)
{
    char *i;
    char *s;
    int ready;

    i = s;
    s = (char *)string;
    ready = 0;
    while(*i)
    {
        ++i;
        if(isspace(*i))
        {
            if(!ready)
            {
                continue ;
            }
            ready = 0;
        }
        ready = 1;
        *(s++) = *i;
    }

    *s = 0;
    return ((char *)string);
}

For my main, I just made a random test case that contains spaces, tabs, and a newline:

int main()
{
    char str[] = "                      hello world\n !";
    printf("%s",my_strtrim(str));
}

The compiler reports an error in the my_strtrim function at i = s, because s has not been given a value yet. The error says:

my_strtrim.c: error: variable 's' is uninitialized when used here [-Werror,-Wuninitialized]
        i = s;
            ^
my_strtrim.c: note: initialize the variable 's' to silence this warning
        char *s;
               ^
                = NULL

After I apply the suggested fix (making s = NULL) I get a segmentation fault. This is confusing because the logic works fine as a for loop, but not as a while loop. I am required to solve this problem with a while loop.

SOLUTION: My friend gave me a rule of thumb: keep code simple and easy to understand. I had a single function doing many things at once, so I was getting confused. He guided me to break the whole function into small pieces.

STEP 0: DECLARE AND INITIALIZE VARIABLES

Self-explanatory.

STEP 1: GET THE START POSITION IN YOUR STRING

int step1_getPosition(char const *string)
{
    int i;

    i = 0;
    while (my_iswhitespace(string[i]))
        i++;
    return (i);
}

STEP 2: COPY YOUR STRING

char *step2_copyString(char const *string, int pos)
{
    char *tmp;
    int i;

    i = 0;
    /* assumes my_strnew(n) returns n + 1 zero-filled bytes, so tmp stays terminated */
    tmp = my_strnew(my_strlen(string));
    if (tmp == NULL)
        return (NULL);
    while (string[pos] != '\0')
        tmp[i++] = string[pos++];
    return (tmp);
}

STEP 3: REMOVE TRAILING WHITESPACE

char *step3_removeWhite(char *str)
{
    int i;

    i = my_strlen(str);
    /* i >= 0 keeps the index in bounds when the string is empty or all whitespace */
    while (i >= 0 && (str[i] == '\0' || my_iswhitespace(str[i])))
    {
        str[i] = '\0';
        i--;
    }
    return (str);
}

STEP 4: REMOVE EXTRA NULL BYTES ('\0')

char *step4_removeExtraNulls(char *str)
{
    char *newstring;

    newstring = my_strdup(str);
    if (newstring == NULL)
        return (NULL);
    free(str);
    return (newstring);
}

STEP 5: MAIN FUNCTION THAT CALLS THE OTHER FUNCTIONS

char *my_strtrim(char const *string)
{
    char *trim;
    int i;

    i = step1_getPosition(string);
    trim = step2_copyString(string, i);
    if (trim == NULL)
        return (NULL);
    step3_removeWhite(trim);
    trim = step4_removeExtraNulls(trim);
    if (trim == NULL)
        return (NULL);
    return (trim);
}

The output I get is: hello world ! which is correct.
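The helpers my_iswhitespace, my_strlen, my_strnew, and my_strdup are not shown above. As a rough sketch, assuming they behave like a three-character whitespace check, strlen, a zero-filled allocation, and strdup, the same steps could be condensed using only the standard library. The names is_ws and trim_copy are made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed behaviour of the my_iswhitespace helper used above. */
static int is_ws(char c)
{
    return (c == ' ' || c == '\t' || c == '\n');
}

/* Hypothetical condensed version of steps 1 to 5: returns a newly
 * allocated copy of string without leading and trailing whitespace,
 * or NULL if allocation fails. The caller must free() the result. */
char *trim_copy(char const *string)
{
    size_t start;
    size_t end;
    char *trim;

    start = 0;
    while (is_ws(string[start]))        /* step 1: skip leading whitespace */
        start++;
    end = strlen(string);
    while (end > start && is_ws(string[end - 1]))
        end--;                          /* step 3: back up over trailing whitespace */
    trim = malloc(end - start + 1);     /* steps 2 and 4: allocate exactly what is needed */
    if (trim == NULL)
        return (NULL);
    memcpy(trim, string + start, end - start);
    trim[end - start] = '\0';
    return (trim);
}
```

Because the result is allocated, a caller would free() it once it has been printed or otherwise used.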

The simplest way to "trim" leading white-space characters is to just skip over them, without modifying the string at all.

This relies on the fact that a "string" in C can be expressed as a pointer to a sequence of null-terminated characters.

Take for example your string

char str[] = "                      hello world\n !";

If we let the array str decay to a pointer to its first element, it points to the first space. What if we had a pointer that pointed to the 'h' instead? That would be an equally valid "string".

To get that pointer, we just loop over the string for as long as the current character is a space (and not the terminator, of course).

Putting this into practice we get

char *my_strtrim(char const *string)
{
    for (/* empty */; *string && my_isspace(*string); ++string)
    {
        // Empty
    }

    return (char *)string;  /* cast needed: the parameter is const, the return type is not */
}

After the loop in the function above, the pointer string will point either to the terminator (if the string was all whitespace), or to the first non-space character in the string.

If we use it like

printf("%s\n", my_strtrim(str));

then it will print

hello world
 !

[The embedded newline is there because you have it in your string.]

It should be noted that this doesn't trim trailing spaces. For that to be possible, the argument string can't be a pointer to constant characters, because the string itself has to be modified.
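To illustrate that last point, here is a minimal sketch of trimming trailing whitespace in place. It takes a non-const char * because it overwrites part of the string, so it must not be called on a string literal. The names my_isspace_sketch and my_rtrim are assumptions made for this example and are not part of the answer above.

```c
#include <string.h>

/* Stand-in for the my_isspace helper: space, tab, and newline only. */
static int my_isspace_sketch(char c)
{
    return (c == ' ' || c == '\t' || c == '\n');
}

/* Hypothetical helper: cuts trailing whitespace by moving the
 * terminator back. Modifies string in place and returns it. */
char *my_rtrim(char *string)
{
    size_t len = strlen(string);

    while (len > 0 && my_isspace_sketch(string[len - 1]))
        len--;
    string[len] = '\0';
    return string;
}
```

Combined with the leading skip, a call such as my_rtrim(my_strtrim(str)) would handle both ends, provided str is a writable array like char str[].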

The problem occurs because at the point where you assign i = s, the variable s has not been initialized yet, so its value is indeterminate. Assign s first, and only then copy it into i.

Please consider the code below:

char *my_strtrim(char const *string)
{
    char *i;
    char *s;
    int ready;

    s = (char *)string;
    i = s;
    ready = 0;
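    while(*i)
    {
        if(!ready && isspace(*i))
        {
            ++i;            /* still in the leading whitespace: skip it */
            continue ;
        }
        ready = 1;          /* first non-space seen: copy everything from here on */
        *(s++) = *(i++);    /* advance after the copy so the first character is not lost */
    }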

    *s = 0;
    return ((char *)string);
}
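As for why the for-loop version behaved differently: in a for loop the increment expression still runs after continue, while in a while loop continue jumps straight back to the condition, so the pointer has to be advanced before it. The sketch below shows the two forms side by side. The function names are invented for the illustration, and the functions simply count non-whitespace characters.

```c
#include <stddef.h>

static int is_ws(char c)
{
    return (c == ' ' || c == '\t' || c == '\n');
}

/* for loop: ++p runs on every iteration, even after continue */
size_t count_nonspace_for(const char *p)
{
    size_t n = 0;

    for (; *p; ++p)
    {
        if (is_ws(*p))
            continue;
        n++;
    }
    return n;
}

/* while loop: advance p explicitly before continue,
 * otherwise the loop would test the same character forever */
size_t count_nonspace_while(const char *p)
{
    size_t n = 0;

    while (*p)
    {
        if (is_ws(*p))
        {
            ++p;
            continue;
        }
        n++;
        ++p;
    }
    return n;
}
```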

Zeid, continuing from the comments. One of the efficiency goals should be to limit the number of passes you make over the string (optimally to one). You should also consider passing a separate array to hold the trimmed string, as there are many instances where the original will need to be preserved, or, as correctly noted by Some programmer dude, you cannot modify a parameter specified as const or one that resides in read-only memory (such as a string literal).

You can use a VLA for that purpose in the caller (the example below simply sizes the result array with sizeof). Putting that all together and adding an additional parameter to hold the result, 'r' below, you could do something like the following.

It simply removes all leading whitespace, then backs up, removing all trailing whitespace, and checks whether the final character is anything other than an alnum character (since most sentences end with some type of punctuation). It then checks whether there is any additional whitespace between the punctuation and the last alnum character in the string, and removes it by shuffling the end characters forward to overwrite any intervening whitespace found (this will get rid of your extra '\n' between world and '!').

#include <stdio.h>
#include <ctype.h>

/** remove leading and trailing whitespace, original is preserved.
 *  this function can be used with or without assigning return.
 *  any intervening whitespace between end punctuation and first
 *  alpha-num character is also removed.
 */
char *strtrimws (char *r, const char *s)
{
    char *p = r;                            /* pointer to result         */
    *r = 0;                                 /* initialize as empty str   */
    if (!s) return NULL;                    /* validate source str       */
    if (!*s) return r;                      /* empty str - nothing to do */
    while (isspace (*s))  s++;              /* skip leading whitespace   */
    while (*s) *p++ = *s++;                 /* fill r with s to end      */
    *p = 0;                                 /* nul-terminate r           */
    while (p > r && isspace (*--p)) *p = 0; /* overwrite spaces from end */
    while (p > r && !isalnum (*--p)) {      /* continue until 1st alnum  */
        if (isspace (*p)) {                 /* if spaces found           */
            char *rp = p, *wp = p;          /* set read & write pointers */
            while (*rp++) *wp++ = *rp;      /* shuffle end chars forward */
            *wp = 0;                        /* nul-terminate at new end  */
        }
    }
    return r;
}

int main (void) {

    char str[] = "                      hello world\n !",
        result[sizeof str];
    printf ("%s\n", strtrimws (result, str));
    return 0;
}

Example Use/Output

$ ./bin/trimws
hello world!
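If the source length is only known at run time, the caller can size the result buffer as a VLA, as mentioned earlier in this answer. The sketch below assumes a compiler with VLA support (standard in C99, optional since C11) and uses a simplified, hypothetical trim_into that only strips leading and trailing whitespace in one forward pass. It is not the strtrimws function above and does not remove whitespace before the final punctuation. The name print_trimmed is likewise an assumption.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-pass trim: copies s into r without leading or
 * trailing whitespace. r must have room for strlen(s) + 1 bytes. */
char *trim_into(char *r, const char *s)
{
    char *p = r;    /* write position */
    char *end = r;  /* one past the last non-space character written */

    while (isspace((unsigned char)*s))
        s++;
    while (*s)
    {
        *p = *s++;
        if (!isspace((unsigned char)*p))
            end = p + 1;
        p++;
    }
    *end = '\0';    /* cut off anything after the last non-space */
    return r;
}

/* Hypothetical caller: the buffer is a VLA sized from the run-time
 * length of s. Prints the trimmed string and returns its length. */
size_t print_trimmed(const char *s)
{
    char result[strlen(s) + 1];

    printf("%s\n", trim_into(result, s));
    return strlen(result);
}
```

Tracking the end position while copying avoids a second backward pass over the result, at the cost of one extra pointer.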

Look things over and let me know if you have any further questions. If not, good luck with your coding.
