Proper way of comparing two strings other than strcmp

I was reading that the proper way of comparing two strings is to use the strcmp() function.

However, I am not sure why the first comparison below evaluates to true (i.e. the body of the first if statement gets printed).

#include <stdio.h>
#include <string.h>
int main()
{
    printf("Hello World\n");
    
    char *x;
    char *w = "oliver bier";
    char *w2 = "oliver bier";
    x = w;
    
    if(x == "oliver bier") { // what is being compared here?, why this leads to True
        printf("This is comparing string also????\n");
        printf("Or this is comparing the type of the variable x and type of oliver bier i.e. a string??\n");
    }
    
    if ( strcmp( x, w2) == 0 ) { // I was reading this is the correct way of comparing strings
        printf("I think this is the correct way to compare two strings\n");
    }
    
    if(x == w) { // what is being compared here?
        printf("This is expected!! since x, w stored same address\n");
    }
    


    return 0;
}


So basically, why does if(x == "oliver bier") evaluate to true as well? I thought x is a pointer to a character, and "oliver bier" is a string.

String literals in C are really arrays of characters whose lifetime lasts the entire run time of the program.

When you get a pointer to a string literal, you get a pointer to its first character (the normal array-to-pointer decay).

Also, the C specification allows compilers to reuse literals. So, for example, all instances of "oliver bier" can (and most likely will) be the exact same array, and the pointers to its first character will then of course also be the same.

That's the reason the comparison x == "oliver bier" evaluates to true here. The language does not guarantee this result, though.

But if you change it to:

char x[] = "oliver bier";

Then the comparison will no longer be true, because x is now a separate array initialized with a copy of the characters, so the pointer to its first element and the pointer to the literal "oliver bier" are different.

Most of the time, calling a function like strcmp is the proper and only reliable way of comparing two strings. Most of the time, checking pointer equality is not a reliable way.

The problem is that two different pointers can point to two different memory regions that contain separate copies of the same string. You can have this:

    +---+        +---+---+---+---+---+---+---+---+---+---+---+---+
p1: | *--------> | o | l | i | v | e | r |   | b | i | e | r |\0 |
    +---+        +---+---+---+---+---+---+---+---+---+---+---+---+

    +---+        +---+---+---+---+---+---+---+---+---+---+---+---+
p2: | *--------> | o | l | i | v | e | r |   | b | i | e | r |\0 |
    +---+        +---+---+---+---+---+---+---+---+---+---+---+---+

Or you can have this:

    +---+        +---+---+---+---+---+---+---+---+---+---+---+---+
p3: | *--------> | o | l | i | v | e | r |   | b | i | e | r |\0 |
    +---+        +---+---+---+---+---+---+---+---+---+---+---+---+
                   ^
    +---+          |
p4: | *------------'
    +---+

p1 and p2 point to different strings, so the pointers will compare unequal. p3 and p4 point to the same string, so the pointers will compare equal.

If the pointers compare equal, obviously strcmp will say the strings are equal, too. But if the pointers are different, the strings might be the same (as in p1 and p2), or they might be different.

Sometimes people write things like

if(str1 == str2 || strcmp(str1, str2) == 0)

This checks whether the two strings str1 and str2 are the same. If the pointers are equal, then the strings are the same, and only if the pointers are not equal does the code perform the (more expensive) call to strcmp to check the actual, pointed-to characters.

When you have two string literals in your program that happen to be the same, like

char *w = "oliver bier";
char *w2 = "oliver bier";

or

char *w = "oliver bier";
...
if(w == "oliver bier") { ... }

you can't predict, in general, whether their pointers will be the same or different, that is, whether the compiler was clever enough to have one in-memory copy of the string do double duty for its use in multiple places. Once upon a time this "cleverness" was quite rare, although I gather that today it's pretty common.
