
Why is fprintf causing memory leak and behaving unpredictably when width argument is missing

The following simple program behaves unpredictably. Sometimes it prints "0.00000", sometimes it prints more "0"s than I can count. Sometimes it uses up all the memory on the system, until the system either kills some process or the program fails with bad_alloc.

#include "stdio.h"

int main() {
  fprintf(stdout, "%.*f", 0.0);
}

I'm aware that this is incorrect usage of fprintf. There should be another argument specifying the width of the formatting. It's just surprising that the behavior is so unpredictable. Sometimes it seems to use a default width, while sometimes it fails very badly. Could this not be made to always fail or always use some default behaviour?

I came across similar usage in some code at work and spent a lot of time figuring out what was happening. It only seemed to happen with debug builds, but would not happen while debugging with gdb. Another curiosity is that running it through valgrind would consistently bring about the printing of many "0"s case, which otherwise happens quite seldom, but the memory usage issue would never occur then either.

I am running Red Hat Enterprise Linux 7, and compiled with gcc 4.8.5.

Formally this is undefined behavior.

As for what you're observing in practice:
My guess is that fprintf ends up using an uninitialized integer as the number of decimal places to output. That's because it tries to read a number from a location where the caller didn't write any particular value, so you just get whatever bits happen to be stored there. If that happens to be a huge number, fprintf will try to allocate a lot of memory to store the result string internally. That would explain the "running out of memory" part.

If the uninitialized value isn't quite that big, the allocation will succeed and you'll end up with a lot of zeroes.

And finally, if the random integer value happens to be just 5, you'll get 0.00000.
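To make these outcomes concrete, here is a minimal sketch in C. The helper name format_fixed is made up for illustration; it only wraps the standard snprintf. It shows that the int consumed by '*' directly sets the number of digits, and that a negative value is treated as if no precision had been given, which yields the default of six digits.

```c
#include <stdio.h>

/* Hypothetical helper (not part of any library): formats `value` into `buf`
   with an explicit precision. The int consumed by '*' must be passed BEFORE
   the double it applies to. Returns what snprintf returns. */
int format_fixed(char *buf, size_t size, int precision, double value) {
    return snprintf(buf, size, "%.*f", precision, value);
}
```

With precision 5 and value 0.0 the buffer holds "0.00000". A huge precision asks the library for that many digits, which is where the long runs of zeroes and the memory use would come from.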

Valgrind probably consistently initializes the memory your program sees, so the behavior becomes deterministic.

Could this not be made to always fail

I'm pretty sure it won't even compile if you use gcc -pedantic -Wall -Wextra -Werror.
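If the offending call goes through your own printf-style wrapper rather than fprintf directly, GCC and Clang can apply the same format checks to it when the wrapper carries the format attribute. The sketch below assumes a GCC-compatible compiler; log_line is a hypothetical wrapper name used only for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper. The attribute is a GCC/Clang extension: argument 1 is
   a printf-style format string and the variadic arguments start at position 2,
   so -Wformat (part of -Wall) can check every call to log_line. */
int log_line(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

int log_line(const char *fmt, ...) {
    va_list args;
    int written;

    va_start(args, fmt);
    written = vfprintf(stdout, fmt, args);  /* forward to the real formatter */
    va_end(args);
    return written;
}
```

With the attribute in place, a call such as log_line("%.*f", 0.0) should draw the same kind of format warning as the fprintf call in the question, and -Werror turns that warning into a build failure.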

Undefined behaviour is undefined.

However, on the x86-64 System V ABI, the first several arguments are passed in registers rather than on the stack. Floating-point arguments are passed in floating-point (SSE) registers, and integers are passed in general-purpose registers. These arguments have no slot on the stack, so the width of the arguments does not matter. Since you never passed any integer in the variable argument part, the general-purpose register that would hold the first variadic integer argument (rdx here, because rdi and rsi carry the stream and the format string) contains whatever garbage was left in it from before.

This program shows that the floating-point values and integers are passed separately:

#include <stdio.h>

int main() {
    fprintf(stdout, "%.*f\n", 42, 0.0);
    fprintf(stdout, "%.*f\n", 0.0, 42);
}

Compiled on x86-64 with GCC and glibc, both fprintf calls produce the same output:

0.000000000000000000000000000000000000000000
0.000000000000000000000000000000000000000000

The format string does not match the arguments, therefore the behaviour of fprintf is undefined. Search for "undefined behaviour C" for more information about undefined behaviour.

This would be correct:

// printf 0.0 with 7 decimals
fprintf(stdout, "%.*f", 7, 0.0);

Or maybe you just want this:

// printf 0.0 with the default format
fprintf(stdout, "%f", 0.0);

About this part of your question: Sometimes it seems to use a default width, while sometimes it fails very badly. Could this not be made to always fail or always use some default behaviour?

There cannot be any default behaviour, because fprintf reads its arguments according to the format string. If the arguments don't match, fprintf ends up with seemingly random values.
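A small sketch of the same mechanism, assuming a made-up one-letter format: sum_by_spec is a hypothetical function, not anything from the C library. Like fprintf, it has no way to discover what the caller really passed, so it reads an int or a double purely because the spec string says so.

```c
#include <stdarg.h>

/* Hypothetical mini-format: each 'i' in `spec` consumes an int and each 'd'
   consumes a double. The function trusts `spec` completely; if the caller
   passes fewer or differently typed arguments, the behaviour is undefined,
   exactly as with a mismatched fprintf format string. */
double sum_by_spec(const char *spec, ...) {
    va_list args;
    const char *p;
    double total = 0.0;

    va_start(args, spec);
    for (p = spec; *p != '\0'; ++p) {
        if (*p == 'i') {
            total += va_arg(args, int);     /* read the next argument as int */
        } else if (*p == 'd') {
            total += va_arg(args, double);  /* read the next argument as double */
        }
    }
    va_end(args);
    return total;
}
```

A matching call such as sum_by_spec("id", 42, 0.5) is well defined. Calling sum_by_spec("id", 0.5) would leave the 'i' step reading an int that was never passed, which mirrors the situation in the question.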


About this part of your question: Another curiosity is that running it through valgrind would consistently bring about the printing of many "0"s case, which otherwise happens quite seldom, but the memory usage issue would never occur then either.

This is just another manifestation of undefined behaviour. Under valgrind the conditions are quite different, and therefore the actual undefined behaviour can be different.

This is undefined behaviour in the standard. It means "anything is fair game" because you're doing wrong things.

The worst part is that almost any compiler will warn you, but you have ignored the warning. Adding some kind of validation beyond what the compiler does would incur a cost that everybody would pay just so you can do what's wrong.

That's the opposite of what C and C++ stand for: you pay for what you use. If you want to pay the cost, it's up to you to do the checking.
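One way to do that checking yourself is to route such calls through a small non-variadic helper, so the compiler sees real parameter types and the precision is clamped before it reaches fprintf. This is only a sketch: print_fixed and MAX_PRECISION are hypothetical names, and the bound of 17 is an arbitrary choice for illustration.

```c
#include <stdio.h>

/* Hypothetical upper bound; pick whatever makes sense for your output. */
enum { MAX_PRECISION = 17 };

/* Hypothetical helper: prints `value` with `precision` digits after the
   decimal point, clamping the precision to [0, MAX_PRECISION] first.
   Returns what fprintf returns. */
int print_fixed(FILE *out, int precision, double value) {
    if (precision < 0) {
        precision = 0;
    }
    if (precision > MAX_PRECISION) {
        precision = MAX_PRECISION;
    }
    return fprintf(out, "%.*f", precision, value);
}
```

Because the helper takes a fixed parameter list, forgetting the precision becomes a compile error instead of undefined behaviour, and only the callers that use it pay for the clamping.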

What's really happening depends on the ABI, compiler and architecture. It's undefined behaviour because the language gives the implementer the freedom to do what's best on each machine (meaning sometimes faster code, sometimes shorter code).

As an example, when you call a function on the machine, it just means that you're instructing the microprocessor to go to a certain code location.

In some made-up assembly and ABI, then, printf("%.*f", 5, 1.0); will translate into something like

mov A, STR_F ; // load into register A the 32 bit address of the string "%.*f"
mov B, 5 ; // load second 32 bit parameter into B 
mov F0, 1.0 ; // load first floating point parameter into register F0
call printf ; // call the function

Now, if you omit a parameter, in this case B, the function will use whatever value was in that register before.

The thing with functions like printf is that they allow anything in their parameter list (the declaration is printf(const char*, ...), so anything is valid). That's why you shouldn't use printf in C++: you have better alternatives, like streams. printf sidesteps the compiler's type checking. Streams are aware of types and are extensible to your own types. Also, that's why your code should compile without warnings.
