Returning struct containing array

The following simple code segfaults under gcc 4.4.4:

#include<stdio.h>

typedef struct Foo Foo;
struct Foo {
    char f[25];
};

Foo foo(){
    Foo f = {"Hello, World!"};
    return f;
}

int main(){
    printf("%s\n", foo().f);
}

Changing the final line to

 Foo f = foo(); printf("%s\n", f.f);

works fine. Both versions work when compiled with -std=c99. Am I simply invoking undefined behavior, or has something in the standard changed that permits the code to work under C99? Why does it crash under C89?

I believe the behavior is undefined both in C89/C90 and in C99.

foo().f is an expression of array type, specifically char[25]. C99 6.3.2.1p3 says:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

The problem in this particular case (an array that's an element of a structure returned by a function) is that there is no "array object". Function results are returned by value, so the result of calling foo() is a value of type struct Foo, and foo().f is a value (not an lvalue) of type char[25].

This is, as far as I know, the only case in C (up to C99) where you can have a non-lvalue expression of array type. I'd say that the behavior of attempting to access it is undefined by omission, likely because the authors of the standard (understandably IMHO) didn't think of this case. You're likely to see different behaviors at different optimization settings.

The new 2011 C standard patches this corner case by introducing a new kind of object lifetime, "temporary lifetime". N1570 (a late pre-C11 draft) says in 6.2.4p8:

A non-lvalue expression with structure or union type, where the structure or union contains a member with array type (including, recursively, members of all contained structures and unions) refers to an object with automatic storage duration and temporary lifetime. Its lifetime begins when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends when the evaluation of the containing full expression or full declarator ends. Any attempt to modify an object with temporary lifetime results in undefined behavior.

So the program's behavior is well defined in C11. Until you're able to get a C11-conforming compiler, though, your best bet is probably to store the result of the function in a local object (assuming your goal is working code rather than breaking compilers):

[...]
int main(void) {
    struct Foo temp = foo();
    printf("%s\n", temp.f);
}

printf is a bit funny, because it's one of those functions that takes varargs. So let's break it down by writing a helper function bar. We'll return to printf later.

(I'm using "gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3")

void bar(const char *t) {
    printf("bar: %s\n", t);
}

and calling that instead:

bar(foo().f); // error: invalid use of non-lvalue array

OK, that gives an error. In C and C++, you are not allowed to pass an array by value. You can work around this limitation by putting the array inside a struct, for example void bar2(Foo f) {...}

But we're not using that workaround, so we're not allowed to pass in the array by value. Now, you might think it should decay to a char*, allowing you to pass the array by reference. But decay only works if the array has an address (i.e. is an lvalue). Temporaries, such as the return values of functions, live in a magic land where they don't have an address, so you can't apply the address-of operator & to a temporary. In short, we're not allowed to take the address of a temporary, and hence it can't decay to a pointer. We are unable to pass it by value (because it's an array), nor by reference (because it's a temporary).

I found that the following code worked:

bar(&(foo().f[0]));

but to be honest I think that's suspect. Hasn't this broken the rules I just listed?

And just to be complete, this works perfectly, as it should:

Foo f = foo();
bar(f.f);

The variable f is not a temporary and hence we can (implicitly, during decay) take its address.

printf, 32-bit versus 64-bit, and weirdness

I promised to mention printf again. According to the above, the compiler should refuse to pass foo().f to any function (including printf). But printf is funny because it's one of those vararg functions. gcc allowed itself to pass the array by value to printf.

When I first compiled and ran the code, it was in 64-bit mode. I didn't see confirmation of my theory until I compiled in 32-bit mode (-m32 to gcc). Sure enough I got a segfault, as in the original question. (I had been getting some gibberish output, but no segfault, when in 64 bits.)

I implemented my own my_printf (with the vararg nonsense) which printed the actual value of the char* before trying to print the letters pointed at by the char*. I called it like so:

my_printf("%s\n", f.f);
my_printf("%s\n", foo().f);

and this is the output I got (code on ideone):

arg = 0xffc14eb3        // my_printf("%s\n", f.f); // worked fine
string = Hello, World!
arg = 0x6c6c6548        // my_printf("%s\n", foo().f); // it's about to crash!
Segmentation fault

The first pointer value 0xffc14eb3 is correct (it points to the characters "Hello, World!"), but look at the second, 0x6c6c6548. Those are the ASCII codes for "Hell" (in reverse order, because the machine is little-endian). The compiler has copied the array by value into printf, and the first four bytes have been interpreted as a 32-bit pointer or integer. This pointer doesn't point anywhere sensible, and hence the program crashes when it attempts to access that location.

I think this is in violation of the standard, simply by virtue of the fact that we're not supposed to be allowed to copy arrays by value.

On MacOS X 10.7.2, both GCC/LLVM 4.2.1 ('i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)') and GCC 4.6.1 (which I built) compile the code without warnings (under -Wall -Wextra), in both 32-bit and 64-bit modes. The programs all run without crashing. This is what I'd expect; the code looks fine to me.

Maybe the problem on Ubuntu is a bug in that specific version of GCC that has since been fixed?
