
What does str[i - 1] == ' ' mean?

I've been reviewing a program that capitalises the first letter of every word in a string. For example, "every single day" becomes "Every Single Day".

I don't understand the part str[i - 1] == ' '. What does that do?

#include <stdio.h>

char    *ft_strcapitalize(char *str)
{
    int i;

    i = 0;
    while (str[i] != '\0')
    {
        if ((i == 0 || str[i - 1] == ' ') &&
                (str[i] <= 'z' && str[i] >= 'a'))
        {
            str[i] -= 32;
        }
        else if (!(i == 0 || str[i - 1] == ' ') &&
                (str[i] >= 'A' && str[i] <= 'Z'))
        {
            str[i] += 32;
        }
        i++;
    }
    return (str);
}

int   main(void)
{
  char str[] = "asdf qWeRtY ZXCV 100TIS";

  printf("\n%s", ft_strcapitalize(str));
  return (0);
}

i is the index in the string of the current character you are considering capitalising (remember that it starts at 0).

i-1 is the index of the character just before the one you are considering.

str[i-1] is the character at that previous position.

== ' ' compares that character to a space character.

So str[i-1] == ' ' means "Is the character to the left of this one a space?"
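The check described above can be isolated in a small helper. This is only a sketch: the name is_word_start is made up for illustration and does not appear in the original program.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if position i starts a word, i.e. it is
 * the first character of the string or the character to its left is a
 * space. The i == 0 test comes first, so str[i - 1] is never read there. */
int is_word_start(const char *str, size_t i)
{
    return i == 0 || str[i - 1] == ' ';
}
```

For "every single day", positions 0, 6 and 13 are the ones this helper reports as word starts.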

"What does str[i - 1] == ' ' mean?"

' ' is a character constant for the space character (ASCII value 32).

str is a pointer to char passed in by the caller. (In practice it should point to an array of char holding a string, not just to a single char.)

i is a counter.


Note that C syntax allows array notation on pointers. Thus, str[1] is equivalent to *(str + 1).
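A minimal sketch of that equivalence, assuming a hypothetical helper named same_element (not part of the original code): indexing and pointer arithmetic reach the same address and therefore the same value.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when str[i] and *(str + i) refer to the
 * same element. By the definition of the [] operator this always holds
 * for a valid index i. */
int same_element(const char *str, size_t i)
{
    return &str[i] == str + i && str[i] == *(str + i);
}
```

The same reasoning applies to str[i - 1], which is just *(str + i - 1).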

The [i - 1] in str[i - 1] means that you access the element immediately before str[i].

That element, str[i - 1], is compared to the space character, i.e. the test asks whether str[i - 1] actually contains a space.

The condition evaluates to true if this is the case; otherwise it is false.


Side Notes:

  • Note that str[i - 1] can be dangerous when i == 0. You would then try to access memory before the start of the array. In your case this is safe, because str[i - 1] == ' ' is only evaluated if i == 0 is not true, thanks to the short-circuit evaluation of the logical OR ||.

     if ((i == 0 || str[i - 1] == ' ')

    So this case is handled in your code.

  • str[i] -= 32; is equivalent to str[i] -= 'a' - 'A'; in ASCII. The latter form can improve readability because it brings the case conversion into focus.
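The short-circuit behaviour from the first note can be observed with a small probe. This is a sketch under assumptions: prev_reads, prev_is_space and starts_word are hypothetical names introduced only to count how often the right-hand operand of || runs.

```c
#include <stddef.h>

/* Hypothetical counter: how many times the right-hand side was evaluated. */
int prev_reads = 0;

/* Hypothetical probe wrapping the str[i - 1] == ' ' test. */
int prev_is_space(const char *str, size_t i)
{
    prev_reads++;               /* only reached when i != 0 */
    return str[i - 1] == ' ';
}

int starts_word(const char *str, size_t i)
{
    /* || evaluates left to right and stops as soon as the result is
     * known, so prev_is_space is skipped entirely when i == 0. */
    return i == 0 || prev_is_space(str, i);
}
```

Calling starts_word(str, 0) leaves prev_reads unchanged, which is why the out-of-bounds read never happens.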

It is checking for spaces or, more exactly, the line

if ((i == 0 || str[i - 1] == ' ')

checks whether we are at the beginning of the string or the previous character was a space, that is, whether a new word has been encountered. In the string "every single day", i = 0 at the e of "every"; at the s of "single", i = 6 and str[i-1] is ' ', marking that a new word has been encountered.

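As an illustration of the same test used for something other than capitalising, here is a hedged sketch that counts word starts. The function name count_word_starts is hypothetical, and the extra str[i] != ' ' condition is an addition so that runs of spaces are not counted as words.

```c
#include <stddef.h>

/* Counts positions where a new word starts: a non-space character that is
 * either first in the string or preceded by a space. */
size_t count_word_starts(const char *str)
{
    size_t count = 0;

    for (size_t i = 0; str[i] != '\0'; i++) {
        if ((i == 0 || str[i - 1] == ' ') && str[i] != ' ')
            count++;
    }
    return count;
}
```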
Here you are comparing str[i-1] with the space character, whose ASCII code is 32.

E.g.:

if(str[i-1] == ' ')
{
 printf("Hello, I'm space.\n");
}
else
{
 printf("You got here, into the false block.\n");
}

Run this snippet: if the comparison yields the value 1 it is true, otherwise it is false. Set str[] = "Ryan Mney"; and then compare, and you will see what is happening.

The C language provides a number of useful character macros that make code both more portable and more readable. Although the sample code you are reviewing does not use them, please consider using them to make your code more portable, more robust, and easier for others to read.

Please use islower/isupper/isalpha and tolower/toupper; these <ctype.h> facilities make C string processing easier to read.

  • islower(ch) - check whether ch is lower case
  • isupper(ch) - check whether ch is upper case
  • isalpha(ch) - check whether ch is alphabetic (lower or upper case)
  • tolower(ch) - convert ch to lower case (if it is alphabetic)
  • toupper(ch) - convert ch to upper case (if it is alphabetic)

Yes, they are typically implemented as macros (see "What is the macro definition of isupper in C?").
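A short sketch of how these facilities combine, assuming a hypothetical helper named flip_case. The cast to unsigned char is a precaution: passing a negative char value to the <ctype.h> functions is undefined behaviour.

```c
#include <ctype.h>

/* Hypothetical helper: flips the case of one character using <ctype.h>
 * classification and conversion. Non-letters come back unchanged. */
char flip_case(char ch)
{
    unsigned char c = (unsigned char)ch;   /* avoid negative arguments */

    if (islower(c))
        return (char)toupper(c);
    if (isupper(c))
        return (char)tolower(c);
    return ch;
}
```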

The C language provides the 'for' control statement, which is a nice way to express string processing. Simple indexed loops are often written using 'for' rather than 'while'.

#include <ctype.h>

char*
ft_strcapitalize(char *str)
{
    for( int i=0; (str[i] != '\0'); i++ )
    {
        if ((i == 0 || isspace(str[i - 1])) && islower(str[i]) )
        {
            str[i] = toupper(str[i]);
        }
        else if (!(i == 0 || isspace(str[i - 1])) && isupper(str[i]) )
        {
            str[i] = tolower(str[i]);
        }
    }
     return (str);
}

A slight refactoring makes the code a bit more readable:

char*
ft_strcapitalize(char *str)
{
    for( int i=0; (str[i] != '\0'); i++ )
    {
        if( (i == 0 || isspace(str[i - 1])) )
        {
            if( islower(str[i]) ) str[i] = toupper(str[i]);
        }
        else if( !(i == 0 || isspace(str[i - 1])) )
        {
            if( isupper(str[i]) ) str[i] = tolower(str[i]);
        }
    }
    return(str);
}

Alternatively, use isalpha(ch):

char*
ft_strcapitalize(char *str)
{
    for( int i=0; (str[i] != '\0'); i++ )
    {
        if( (i == 0 || isspace(str[i - 1])) )
        {
            if( isalpha(str[i]) ) str[i] = toupper(str[i]);
        }
        else if( !(i == 0 || isspace(str[i - 1])) )
        {
            if( isalpha(str[i]) ) str[i] = tolower(str[i]);
        }
    }
    return(str);
}

Simplify the conditional expression even further by handling the special case (first character of the string) first.

char*
ft_strcapitalize(char *str)
{
    if( str[0] == '\0' ) return(str);   /* empty string: str[1] must not be read */
    if( islower(str[0]) ) str[0] = toupper(str[0]);

    for( int i=1; (str[i] != '\0'); i++ )
    {
        if( isspace(str[i - 1]) )
        {
            if( islower(str[i]) ) str[i] = toupper(str[i]);
        }
        else if( !isspace(str[i - 1]) )
        {
            if( isupper(str[i]) ) str[i] = tolower(str[i]);
        }
    }
    return(str);
}

Again, the alternate isalpha(ch) version:

char*
ft_strcapitalize(char *str)
{
    if( str[0] == '\0' ) return(str);   /* empty string: str[1] must not be read */
    if( isalpha(str[0]) ) str[0] = toupper(str[0]);

    for( int i=1; (str[i] != '\0'); i++ )
    {
        if( isspace(str[i - 1]) )
        {
            if( isalpha(str[i]) ) str[i] = toupper(str[i]);
        }
        else if( !isspace(str[i - 1]) )
        {
            if( isalpha(str[i]) ) str[i] = tolower(str[i]);
        }
    }
    return(str);
}

Even more idiomatic: just use a 'state' flag that indicates whether we should fold to upper or lower case.

char*
ft_strcapitalize(char *str)
{
    int first=1;
    for( char* p=str; *p; p++ ) {
        if( isspace(*p) ) {
            first = 1;
        }
        else if( !isspace(*p) ) {
            if( first ) {
                /* first character of a word: fold to upper case */
                if( isalpha(*p) ) *p = toupper(*p);
                first = 0;
            }
            else {
                /* inside a word: fold to lower case */
                if( isalpha(*p) ) *p = tolower(*p);
            }
        }
    }
    return(str);
}

And your main test driver:

int   main(void)
{
    char str[] = "asdf qWeRtY ZXCV 100TIS";

    printf("\n%s", ft_strcapitalize(str));
    return (0);
}

' ' is a character constant representing the value of the space character in the execution character set. Using ' ' instead of 32 improves both readability and portability to systems where space might not have the same value as in the ASCII character set. (i == 0 || str[i - 1] == ' ') is true if i is the offset of the beginning of a word in a space-separated list of words.

It is important to try to make the code as simple and readable as possible. Using magic constants like 32 is not recommended when a more expressive alternative is easy and cheap. For example, you convert lowercase characters to uppercase with str[i] -= 32: this magic value 32 (again) happens to be the offset between the lowercase and the uppercase characters. It would be more readable to write:

    str[i] -= 'a' - 'A';

Similarly, you wrote the range tests for lower case and upper case in the opposite order: this is error prone and surprising for the reader.

You are also repeating the test for the start of a word: testing for lower case only at the start of a word and for upper case otherwise makes the code simpler.

Finally, using a for loop is more concise and less error prone than the while loop in your function, but I know that the local coding conventions at your school disallow for loops.

Here is a modified version:

#include <stdio.h>

char *ft_strcapitalize(char *str) {
    size_t i;

    i = 0;
    while (str[i] != '\0') {
        if (i == 0 || str[i - 1] == ' ') {
            if (str[i] >= 'a' && str[i] <= 'z') {
                str[i] -= 'a' - 'A';
            }
        } else {
            if (str[i] >= 'A' && str[i] <= 'Z') {
                str[i] += 'a' - 'A';
            }
        }
        i++;
    }
    return str;
}

int main(void) {
    char str[] = "asdf qWeRtY ZXCV 100TIS";

    printf("\n%s", ft_strcapitalize(str));
    return 0;
}

Note that the above code still assumes that the letters form two contiguous blocks in the same order from a to z. This assumption holds for the ASCII character set, which is almost universal today, but only partially for the EBCDIC set still in use on some mainframe systems, where there is a constant offset between cases but the letters from a to z do not form a contiguous block.
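To illustrate the contiguity concern, here is a hedged sketch of a case test and conversion that look letters up in explicit tables instead of relying on range comparisons. The names is_basic_lower and to_basic_upper are hypothetical, and the tables cover only the 26 basic Latin letters.

```c
#include <string.h>

/* Hypothetical helper: case test that does not rely on 'a'..'z' being
 * contiguous. The '\0' check is needed because strchr would otherwise
 * match the string terminator. */
int is_basic_lower(char ch)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";

    return ch != '\0' && strchr(lower, ch) != NULL;
}

/* Hypothetical helper: maps a basic lowercase letter to uppercase through
 * matching table positions; any other character is returned unchanged. */
char to_basic_upper(char ch)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p = (ch != '\0') ? strchr(lower, ch) : NULL;

    return p ? upper[p - lower] : ch;
}
```

The lookup is slower than a range comparison, so it is mainly useful for showing what the range test silently assumes.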

A more generic approach would use functions and macros from <ctype.h> to test for whitespace (space and other whitespace characters), to test character case, and to convert case:

#include <ctype.h>

char *ft_strcapitalize(char *str) {
    for (size_t i = 0; str[i] != '\0'; i++) {
        if (i == 0 || isspace((unsigned char)str[i - 1]))
            str[i] = toupper((unsigned char)str[i]);
        else
            str[i] = tolower((unsigned char)str[i]);
    }
    return str;
}
