Why does strcmp() return 0 when its inputs are equal?

When I make a call to the C string compare function like this:

strcmp("time","time")

It returns 0, which seems to imply that the strings are not equal.

Can anyone tell me why C implementations seem to do this? I would think it would return a non-zero value if equal. I am curious about the reasons I am seeing this behavior.

strcmp returns a lexical difference (or should I call it a "short-circuit serial byte comparator"? :-) ) of the two strings you have given as parameters.

0 means that both strings are equal.

A positive value means that s1 would be after s2 in a dictionary.

A negative value means that s1 would be before s2 in a dictionary.

Hence your non-zero value when comparing "time" and "money", which are obviously different, even though one would say that time is money! :-)
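As a rough sketch of those three cases (the helper name describe_order is invented here for illustration), only the sign of the result is inspected, because the C standard does not promise any particular magnitude:

```c
#include <string.h>

/* Hypothetical helper: turns the sign of strcmp() into a word. */
const char *describe_order(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    if (r < 0)
        return "before";   /* s1 sorts before s2 */
    if (r > 0)
        return "after";    /* s1 sorts after s2 */
    return "same";         /* r == 0: the strings are equal */
}
```

With this, describe_order("time", "time") gives "same", describe_order("time", "money") gives "after" because 't' is greater than 'm', and describe_order("money", "time") gives "before".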

The nice thing about an implementation like this is that you can say

if(strcmp(<stringA>, <stringB>) > 0)   // Implies stringA > stringB
if(strcmp(<stringA>, <stringB>) == 0)  // Implies stringA == stringB
if(strcmp(<stringA>, <stringB>) < 0)   // Implies stringA < stringB
if(strcmp(<stringA>, <stringB>) >= 0)  // Implies stringA >= stringB
if(strcmp(<stringA>, <stringB>) <= 0)  // Implies stringA <= stringB
if(strcmp(<stringA>, <stringB>) != 0)  // Implies stringA != stringB

Note how the comparison with 0 exactly matches the comparison in the implication.

It's common for functions to return zero for the common (or one-of-a-kind) case and non-zero for special cases. Take the main function, which conventionally returns zero on success and some non-zero value on failure. The precise non-zero value indicates what went wrong, for example: out of memory, no access rights, or something else.

In your case, if the strings are equal, there is only one way for that to happen: they contain the same characters. But if they are not equal, then either the first can be smaller or the second can be smaller. Having it return 1 for equal, 0 for smaller and 2 for greater would be rather strange, I think.

You can also think about it in terms of subtraction:

return = s1 - s2

If s1 is "lexicographically" less, then it will give a negative value.

Another reason strcmp() returns the codes it does is so that it can be used directly in the standard library function qsort(), allowing you to sort an array of strings:

#include <string.h> // for strcmp()
#include <stdlib.h> // for qsort()
#include <stdio.h>

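// qsort() hands the comparator pointers to the array elements,
// so each argument is really a pointer to a char pointer.
int sort_func(const void *a, const void *b)
{
    const char *const *s1 = (const char *const *)a;
    const char *const *s2 = (const char *const *)b;
    return strcmp(*s1, *s2);
}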

int main(int argc, char **argv)
{
    int i;
    printf("Pre-sort:\n");
    for(i = 1; i < argc; i++)
        printf("Argument %i is %s\n", i, argv[i]);
    qsort((void *)(argv + 1), argc - 1, sizeof(char *), sort_func);
    printf("Post-sort:\n");
    for(i = 1; i < argc; i++)
        printf("Argument %i is %s\n", i, argv[i]);
    return 0;
}

This little sample program sorts its arguments ASCIIbetically (what some would call lexically). Lookie:

$ gcc -o sort sort.c
$ ./sort hi there little fella
Pre-sort:
Argument 1 is hi
Argument 2 is there
Argument 3 is little
Argument 4 is fella
Post-sort:
Argument 1 is fella
Argument 2 is hi
Argument 3 is little
Argument 4 is there

If strcmp() returned 1 (true) for equal strings and 0 (false) for unequal ones, it would be impossible to use it to obtain the direction of the inequality (i.e. which one is bigger) between the two strings, thus making it impossible to use it as a sorting function.

I don't know how familiar you are with C. The above code uses some of C's most confusing concepts (pointer arithmetic, pointer casting, and function pointers), so if you don't understand some of that code, don't worry, you'll get there in time. Until then, you'll have plenty of fun questions to ask on StackOverflow. ;)

You seem to want strcmp to work like a (hypothetical)

int isEqual(const char *, const char *)

To be sure, that would be true to the "zero is false" interpretation of integer results, but it would complicate the logic of sorting because, having established that the two strings were not the same, you would still need to learn which came "earlier".

Moreover, I suspect that a common implementation looks like
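If a yes/no answer is all that is needed, such a function is easy to layer on top of strcmp(). A minimal sketch, assuming the hypothetical isEqual name from the prototype above (it is not part of the standard library):

```c
#include <string.h>

/* Hypothetical wrapper: non-zero (true) when both strings hold the same
   characters, zero (false) otherwise. */
int isEqual(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

This keeps the "zero is false" reading for equality tests, while strcmp() itself stays available for ordering.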

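int strcmp(const char *s1, const char *s2){
   /* compare as unsigned char, which is how the C standard defines the result */
   const unsigned char *q1 = (const unsigned char *)s1;
   const unsigned char *q2 = (const unsigned char *)s2;
   while ((*q1 == *q2) && *q1){
      ++q1; ++q2;
   }
   return (*q1 - *q2);
}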

which is [edit: kinda] elegant in a K&R kind of way. The important point here (which is increasingly obscured by getting the code right (evidently I should have left well enough alone)) is the return statement:

   return (*q1 - *q2);

which gives the results of the comparison naturally in terms of the character values.

There are three possible results: string 1 comes before string 2, string 1 comes after string 2, or string 1 is the same as string 2. It is important to keep these three results separate; one use of strcmp() is to sort strings. The question is how you want to assign values to these three outcomes, and how to keep things more or less consistent. You might also look at the parameters for qsort() and bsearch(), which require compare functions much like strcmp().
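To illustrate the bsearch() side, here is a small sketch that looks up a string in an already sorted array. The names cmp_key and find_word are made up for this example; bsearch() itself is the standard library function, and it always passes the key as the first comparator argument and a pointer to an array element as the second.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for bsearch(): key is the string being searched for,
   elem points at one element of the array (itself a char pointer). */
static int cmp_key(const void *key, const void *elem)
{
    const char *k = key;
    const char *const *e = elem;
    return strcmp(k, *e);
}

/* Hypothetical helper: index of key in a sorted array of strings, or -1. */
long find_word(const char *key, const char *const *sorted, size_t n)
{
    const char *const *hit = bsearch(key, sorted, n, sizeof *sorted, cmp_key);
    return hit ? (long)(hit - sorted) : -1;
}
```

The array must already be ordered the way strcmp() orders it, for instance by a prior qsort() call with a strcmp()-based comparator. The three-way result is what lets bsearch() decide whether to continue in the lower or the upper half.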

If you wanted a string equality function, it would return non-zero for equal strings and zero for non-equal strings, to go along with C's rules on true and false. This means that there would be no way of distinguishing whether string 1 came before or after string 2. There are multiple true values for an int, or any other C data type you care to name, but only one false.

Therefore, having a useful strcmp() that returned true for string equality would require a lot of changes to the rest of the language, which simply aren't going to happen.

I guess it is simply for symmetry: -1 for less, 0 for equal, 1 for more.
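Note that the C standard only promises a result less than, equal to, or greater than zero, so an implementation may return other magnitudes. If exactly -1, 0, or 1 is wanted, a sketch like the following would normalize it (strcmp_sign is a made-up name, not a library function):

```c
#include <string.h>

/* Hypothetical helper: collapses strcmp()'s result to -1, 0, or 1. */
int strcmp_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    /* each comparison yields 0 or 1, so the difference is -1, 0, or 1 */
    return (r > 0) - (r < 0);
}
```

Code that only tests the result against 0 with <, ==, or > does not need this and works with any conforming strcmp().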
