What are the ramifications of returning the value -1 as a size_t return value in C?

I am reading a textbook and one of the examples does this. Below, I've reproduced the example in abbreviated form:

#include <stdio.h>
#define SIZE 100

size_t linearSearch(const int array[], int searchVal, size_t size);

int main(void)
{
    int myArray[SIZE];
    int mySearchVal;
    size_t returnValue;

    // populate array with data & prompt user for the search value

    // call linear search function
    returnValue = linearSearch(myArray, mySearchVal, SIZE);

    if (returnValue != -1)
        puts("Value Found");
    else
        puts("Value Not Found");
}

size_t linearSearch(const int array[], int key, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (key == array[i])
            return i;
    }
    return -1;
}

Are there any potential problems with this? I know size_t is defined as an unsigned integral type, so it seems as if this might be asking for trouble at some point if I'm returning -1 as a size_t return value.

A few APIs come to mind that use the maximum signed or unsigned integer value as a sentinel value. For example, C++'s std::string::find() method returns std::string::npos if the value given to find() could not be found within the string, and std::string::npos is equal to (std::string::size_type)-1.
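The same idea can be sketched in C. The names NOT_FOUND and find_index below are made up for illustration; they are not from the textbook or from any library. Naming the sentinel once, with an explicit cast, keeps the signed literal -1 out of later comparisons.

```c
#include <stddef.h>

/* Hypothetical sentinel, analogous to std::string::npos. */
#define NOT_FOUND ((size_t)-1)

/* Returns the index of the first element equal to key, or NOT_FOUND. */
size_t find_index(const int array[], size_t size, int key)
{
    for (size_t i = 0; i < size; i++) {
        if (array[i] == key)
            return i;
    }
    return NOT_FOUND;
}
```

A caller would then write something like `if (find_index(a, n, key) != NOT_FOUND)`, which compares two size_t values and so gives the compiler nothing to warn about.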

Similarly, on iOS and OS X, NSArray's indexOfObject: method returns NSNotFound when the object cannot be found in the array. Surprisingly, NSNotFound is actually defined as NSIntegerMax, which is either INT_MAX for 32-bit platforms or LONG_MAX for 64-bit platforms, even though NSArray indexes are typically NSUInteger (which is either unsigned int for 32-bit platforms or unsigned long for 64-bit platforms).

It does mean that there will be no distinction between “not found” and “element number 18,446,744,073,709,551,615” (for 64-bit systems), but whether that is an acceptable trade-off is up to you.

An alternative is to have the function return the index through a pointer argument and have the function's return value indicate success or failure, e.g.

#include <stdbool.h>
#include <stddef.h>

bool linearSearch(const int array[], int val, size_t size, size_t *index)
{
    for (size_t i = 0; i < size; i++)
    {
        if (array[i] == val)
        {
            *index = i; // report where the value was found
            return true;
        }
    }

    *index = 0; // optional; in some cases it is better to leave *index untouched
    return false;
}
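On the calling side, the index is only meaningful when the function returns true. The sketch below repeats a compact version of the function so that it compiles on its own, and adds a hypothetical caller, replaceFirst, which is not part of the original answer. This version takes the "leave *index untouched" option on failure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same contract as above; repeated so this block compiles on its own. */
bool linearSearch(const int array[], int val, size_t size, size_t *index)
{
    for (size_t i = 0; i < size; i++) {
        if (array[i] == val) {
            *index = i;
            return true;
        }
    }
    return false; /* *index deliberately left untouched */
}

/* Hypothetical caller: overwrites the first occurrence of oldVal. */
bool replaceFirst(int array[], size_t size, int oldVal, int newVal)
{
    size_t idx;
    if (!linearSearch(array, oldVal, size, &idx))
        return false;       /* idx was never written; do not read it */
    array[idx] = newVal;
    return true;
}
```

Because success is reported separately, every size_t value remains available as a real index, at the cost of a slightly more awkward call.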

Your compiler may decide to complain about comparing signed with unsigned (GCC or Clang will if provoked *), but otherwise "it works". On two's-complement machines (most machines these days), (size_t)-1 is the same as SIZE_MAX. Indeed, as discussed in extenso in the comments, it is the same for one's-complement or sign-magnitude machines too, because §6.3.1.3 of the C99 and C11 standards defines the conversion in terms of values, not bit patterns: an out-of-range value is brought into range by repeatedly adding or subtracting one more than the maximum value of the unsigned type.
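A small sketch of what that conversion rule implies. The function names are hypothetical. One assumption is labeled in the comments: a bare `r != -1` only behaves this way when size_t is at least as wide as unsigned int, which holds on mainstream platforms.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */

/* Converting -1 to size_t yields -1 + (SIZE_MAX + 1), i.e. SIZE_MAX. */
size_t minus_one_as_size_t(void)
{
    return -1;   /* implicit conversion, as in the question's linearSearch */
}

/* Explicit form of the test. In a bare `r != -1` the int -1 is converted
   to size_t by the usual arithmetic conversions (assuming size_t is at
   least as wide as unsigned int), so the two forms agree there. */
bool is_not_found(size_t r)
{
    return r == (size_t)-1;
}
```

Writing the cast explicitly avoids the signed/unsigned comparison altogether.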

Using (size_t)-1 to indicate 'not found' means that you can't distinguish between the last entry in the biggest possible array and 'not found', but that's seldom an actual problem.

So, it's just the one edge case where I could end up having a problem?

The array would have to be an array of char, though, to be big enough to cause trouble. While you could have 4 GiB of memory on a 32-bit machine, it's pretty implausible to have all that memory committed to a character array (and it's very much less likely to be an issue with 64-bit machines; most don't run to 16 exbibytes of memory). So it isn't a practical edge case.

In POSIX, there is a ssize_t type, a signed type of the same size as size_t. You could consider using that instead of size_t. However, it causes the same angst that (size_t)-1 causes, in my experience. Plus, on a 32-bit machine, you could have a 3 GiB chunk of memory treated as an array of char, but with ssize_t as a return type, you couldn't usefully use more than 2 GiB, or you'd need to use SSIZE_MIN (if it existed; I'm not sure it does) instead of -1 as the signal value.
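For comparison, a minimal sketch of the ssize_t variant. It assumes a POSIX system, since ssize_t comes from <sys/types.h> rather than ISO C, and the name linearSearchSigned is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t (POSIX, not ISO C) */

/* Hypothetical variant: returns the index, or -1 if key is absent.
   Only indices up to SSIZE_MAX can be reported this way. */
ssize_t linearSearchSigned(const int array[], int key, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (array[i] == key)
            return (ssize_t)i;
    }
    return -1;
}
```

The caller can then test `result < 0` or `result == -1` without mixing signed and unsigned operands, but loses the upper half of the index range.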


* GCC or Clang has to be provoked fairly hard. Simply using -Wall is not sufficient; it takes -Wextra (or the specific -Wsign-compare option) to trigger a warning. Since I routinely compile with -Wextra, I'm aware of the issue; not everyone is as vigilant.

Comparing signed and unsigned quantities is fully defined by the standard, but can lead to counter-intuitive results (because small negative numbers appear very large when converted to unsigned values), which is why the compilers complain if requested to do so.
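A short illustration of the counter-intuitive case, with hypothetical function names. The cast in less_mixed spells out what a plain `a < b` does implicitly when a is an int and b is an unsigned int.

```c
#include <stdbool.h>

/* Mixed comparison: the int operand is converted to unsigned first,
   so -1 becomes UINT_MAX before the comparison happens. */
bool less_mixed(int a, unsigned int b)
{
    return (unsigned int)a < b;   /* what `a < b` does implicitly */
}

/* Value-correct comparison: handle a negative a separately. */
bool less_by_value(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

With a = -1 and b = 1, less_mixed returns false while less_by_value returns true, which is the surprise the warning is meant to flag.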

Normally, if you want to return negative values and still have some notion of a size type, you use ssize_t. GCC and Clang both complain, but the following compiles. Note that some of the following is undefined behavior...

#include <stdio.h>
#include <stdint.h>

size_t foo(void) {
  return -1;  // converted to size_t, i.e. SIZE_MAX
}

// Prints the low `bytes` bytes of num, least significant bit first,
// with a '|' before each group of 8 bits.
void print_bin(uint64_t num, size_t bytes) {
  for (int i = (int)(bytes * 8); i > 0; i--) {
    if (i % 8 == 0)
      putchar('|');
    putchar((num & 1) ? '1' : '0');
    num >>= 1;
  }
  putchar('\n');
}

int main(void) {
  long int x = 0;
  printf("%zu\n", foo());
  printf("%ld\n", foo());      // undefined behavior: size_t passed for %ld
  printf("%zu\n", ~(x & 0));   // undefined behavior: long passed for %zu
  printf("%ld\n", ~(x & 0));

  print_bin((~(x & 0)), 8);
}

The output is

18446744073709551615
-1
18446744073709551615
-1
|11111111|11111111|11111111|11111111|11111111|11111111|11111111|11111111

I'm on a 64-bit machine. The following in binary

|11111111|11111111|11111111|11111111|11111111|11111111|11111111|11111111

can mean -1 or 18446744073709551615; it depends on context, i.e. on how the type that has that binary representation is being used.
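The same point can be shown without relying on mismatched printf specifiers. This is a sketch with a hypothetical helper name; it uses memcpy to copy the 64 bits unchanged, and relies on int64_t being two's complement with no padding bits, which the C standard requires of that type.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 64 bits of an unsigned value as a signed one.
   memcpy copies the object representation with no value conversion. */
int64_t bits_as_signed(uint64_t bits)
{
    int64_t s;
    memcpy(&s, &bits, sizeof s);
    return s;
}
```

Passing UINT64_MAX (all 64 bits set) returns -1: the same bits, read through a different type.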
