Memory waste? If main() should only return 0 or 1, why is main declared with int and not short int or even char?

Question

For example:

#include <stdio.h> 
int main (void)                         /* Why int and not short int? - Waste of Memory */ 
{
     printf("Hello World!");
     return 0; 
}

Why is main() conventionally declared with the int type, which occupies 4 bytes on a 32-bit system, if it usually returns only 0 or 1, when other types such as short int (2 bytes on 32-bit) or even char (1 byte on 32-bit) would save memory?

It is wasting memory space.

NOTE: The question is not a duplicate of the thread given; its answers only address the return value itself, not its data type specifically.

The question is for both C and C++. If the answers differ between the two, please say which language your answer applies to.

Answer 1

Usually, at the assembly level, a register is used to return a value (for example, the AX register on Intel processors). The type int corresponds to the machine word. That is, no conversion is required, as there would be for, say, a byte of type char that has to be widened to the machine word.

And in fact, main can return any integer value.

Answer 2

It's because of a machine that's half a century old.

Back in the day when C was created, an int was a machine word on the PDP-11 - sixteen bits - and it was natural and efficient to have main return that.

The "machine word" was the only type in the B language, which Ritchie and Thompson had developed earlier, and which C grew out of.
When C added types, not specifying one gave you a machine word - an int.
(It was very important at the time to save space, so not requiring the most common type to be spelled out was a Very Good Thing.)

So, since a B program started with

main()

and programmers are generally language-conservative, C did the same and returned an int.

Answer 3

There are two reasons I would not consider this a waste:

1 practical use of 4 byte exit code

If you want to return an exit code that exactly describes an error, you want more than 8 bits.

As an example, you may want to group errors: the first byte could describe the general type of error, the second byte the function that caused the error, the third byte the cause of the error, and the fourth byte additional debug information.
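The byte-grouping idea above could be sketched roughly as follows. The field layout and the names pack_status and status_field are hypothetical, chosen only for illustration. The sketch also assumes that all 32 bits reach whoever inspects the status, which, as other answers in this thread note, is not the case for shells that keep only the low 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: one byte per field, most significant byte first.
 * category | function_id | cause | debug */
uint32_t pack_status(uint8_t category, uint8_t function_id,
                     uint8_t cause, uint8_t debug)
{
    return ((uint32_t)category << 24) | ((uint32_t)function_id << 16) |
           ((uint32_t)cause << 8) | (uint32_t)debug;
}

/* Extract field `index` (0 = category ... 3 = debug) from a packed status. */
uint8_t status_field(uint32_t status, unsigned index)
{
    assert(index < 4); /* only four one-byte fields exist in this layout */
    return (uint8_t)(status >> (8u * (3u - index)));
}
```

A caller would build the value once at the point of failure and decode it field by field when reporting the error.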

2 Padding

If you pass a single short or char, it will still be aligned to fit a machine word, which is often 4 bytes (32 bits), depending on the architecture. This is called padding, and it means that you will most likely still need 32 bits of memory to return a single short or char.

Answer 4

The old-fashioned convention with most shells is to use the least significant 8 bits of int, not just 0 or 1. 16 bits is increasingly common due to that being the minimum size of an int allowed by the standard.

And what would the issue be with wasting space? Is the space really wasted? Is your computer so full of "stuff" that the remaining sizeof(int) * CHAR_BIT - 8 would make a difference? Could the architecture exploit that and use those remaining bits for something else? I very much doubt it.

So I wouldn't say the memory is wasted at all, since you get it back from the operating system when the program finishes. Perhaps extravagant? A bit like using a large wine glass for a small tipple, perhaps?

Answer 5

1st: Your assumption that it "usually returns only 0 or 1" is already wrong.

Usually the return code is expected to be 0 if no error occurred, but otherwise it can be any number, representing different errors. Most programs (at least command-line programs) do so. Many programs also return negative numbers.

However, there are a few commonly used codes: https://www.tldp.org/LDP/abs/html/exitcodes.html. Another SO member also points to a Unix header that contains some codes: https://stackoverflow.com/a/24121322/2331592

So, after all, it is not just a C or C++ type matter. There are also historical reasons in how most operating systems work and how they expect programs to behave. The languages have to support that, and at least C-like languages do so by using int main(...).

2nd: Your conclusion that "it is wasting memory space" is wrong.

Using an int in comparison to a shorter type does not involve any waste.
- Memory is usually handled in word-size units anyway (what that means depends on your architecture).
- Working with sub-word types involves computational overhead on some architectures (read: load the word, mask out the unrelated bits; store: load the word from memory, mask out the variable's bits, OR them with the new value, write the word back).
- The memory is not wasted unless you use it. If you write return 0; no memory is ever used at this point. If you write return myMemorySaving8bitVar; only 1 byte is used (most probably on the stack, if it is not optimized out altogether).
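The extra work for sub-word accesses can be illustrated in plain C. This is only a sketch of the mask-and-merge steps described above, written against a 32-bit word; the helper names load_byte and store_byte are made up for this example, and real hardware or compilers may do this differently (or have native byte instructions).

```c
#include <assert.h>
#include <stdint.h>

/* Read byte `index` (0 = least significant) out of a 32-bit word:
 * shift the wanted byte down, then mask out the unrelated bits. */
uint8_t load_byte(uint32_t word, unsigned index)
{
    assert(index < 4);
    return (uint8_t)((word >> (8u * index)) & 0xFFu);
}

/* Replace byte `index` of a word: clear the old bits of that byte,
 * OR in the new value, and hand back the merged word. */
uint32_t store_byte(uint32_t word, unsigned index, uint8_t value)
{
    assert(index < 4);
    uint32_t shift = 8u * index;
    uint32_t cleared = word & ~((uint32_t)0xFFu << shift);
    return cleared | ((uint32_t)value << shift);
}
```

Working with a full word needs none of these steps, which is the point the list above makes.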

Answer 6

The answer is "because it usually doesn't return only 0 or 1." I found this thread from the Software Engineering community that at least partially answers your question. Here are the two highlights, first from the accepted answer:

An integer gives more room than a byte for reporting the error. It can be enumerated (return of 1 means XYZ, return of 2 means ABC, return of 3, means DEF, etc..) or used as flags ( 0x0001 means this failed, 0x0002 means that failed, 0x0003 means both this and that failed). Limiting this to just a byte could easily run out of flags (only 8), so the decision was probably to use an integer.
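The flag style mentioned in the quote might look like the sketch below. The names THIS_FAILED, THAT_FAILED, combine_failures and has_failed are hypothetical; only the values 0x0001, 0x0002 and their combination 0x0003 come from the quoted text.

```c
#include <assert.h>

/* One bit per failure, using the values from the quote above. */
enum { THIS_FAILED = 0x0001, THAT_FAILED = 0x0002 };

/* Build a status word by OR-ing in a bit for each failure that occurred. */
int combine_failures(int this_failed, int that_failed)
{
    int status = 0;
    if (this_failed)
        status |= THIS_FAILED;
    if (that_failed)
        status |= THAT_FAILED;
    return status;
}

/* Test whether one particular failure bit is set in a status word. */
int has_failed(int status, int flag)
{
    assert(flag == THIS_FAILED || flag == THAT_FAILED);
    return (status & flag) != 0;
}
```

With one bit per condition, a single byte holds at most eight independent flags, which is the limit the quote refers to.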

An interesting point is also raised by Keith Thompson:

For example, in the dialect of C used in the Plan 9 operating system main is normally declared as a void function, but the exit status is returned to the calling environment by passing a string pointer to the exits() function. The empty string denotes success, and any non-empty string denotes some kind of failure. This could have been implemented by having main return a char* result.

Here's another interesting bit from a unix.com forum:

(Some of the following may be x86 specific.)

Returning to the original question: Where is the exit status stored? Inside the kernel.

When you call exit(n), the least significant 8 bits of the integer n are written to a cpu register. The kernel system call implementation will then copy it to a process-related data structure.

What if your code doesn't call exit()? The c runtime library responsible for invoking main() will call exit() (or some variant thereof) on your behalf. The return value of main(), which is passed to the c runtime in a register, is used as the argument to the exit() call.

Related to the last quote, here's another from cppreference.com:

5) Execution of the return (or the implicit return upon reaching the end of main) is equivalent to first leaving the function normally (which destroys the objects with automatic storage duration) and then calling std::exit with the same argument as the argument of the return. (std::exit then destroys static objects and terminates the program)

Lastly, I found this really cool example here (although the author of the post is wrong in saying that the result returned is the returned value modulo 512). After compiling and executing the following:

int main() {
    return 42001;
}

on ~~a POSIX compliant~~ my* system, echo $? returns 17. That is because 42001 % 256 == 17, which shows that 8 bits of data are actually used. With that in mind, choosing int ensures that enough storage is available for passing the program's exit status information, because, as per this answer, compliance to the C++ standard guarantees that size of int (in bits) can't be less than 8. That's because it must be large enough to hold "the eight-bit code units of the Unicode UTF-8 encoding form."
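The arithmetic can be checked with a small helper. This is only a sketch: the function name visible_exit_status is made up, and it models the assumption that the shell reports just the least significant 8 bits of the value returned from main.

```c
#include <assert.h>

/* Assumption: only the low 8 bits of the status are visible to the shell.
 * Going through unsigned keeps the masking well defined for negative input. */
int visible_exit_status(int returned)
{
    int status = (int)((unsigned)returned & 0xFFu);
    assert(status >= 0 && status <= 255); /* always fits in one byte */
    return status;
}
```

Under that assumption, visible_exit_status(42001) gives 17, matching the echo $? result above.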

EDIT:

*As Andrew Henle pointed out in the comment:

A fully POSIX compliant system makes the entire int return value available, not just 8 bits. See pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html : "If si_code is equal to CLD_EXITED , then si_status holds the exit value of the process; otherwise, it is equal to the signal that caused the process to change state. The exit value in si_status shall be equal to the full exit value (that is, the value passed to _exit() , _Exit() , or exit() , or returned from main() ); it shall not be limited to the least significant eight bits of the value."

I think this makes for an even stronger argument for the use of int over data types of smaller sizes.

Answer 7

You're either working in or learning C, so I think it's a Real Good Idea that you are concerned with efficiency. However, a few things need clarifying here.

First, the int data type is not and never was intended to mean "32 bits". The idea was that int would be the most natural binary integer type on the target machine--usually the size of a register.
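If you want to see what int means on a given machine, something like the following sketch reports its storage size in bits. The function name int_storage_bits is hypothetical. The only portable guarantee relied on here is that INT_MAX is at least 32767, so an int occupies at least 16 bits; anything beyond that is up to the implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits an int occupies in storage on this implementation. */
size_t int_storage_bits(void)
{
    size_t bits = sizeof(int) * CHAR_BIT;
    assert(bits >= 16); /* follows from INT_MAX >= 32767 */
    return bits;
}
```

On many current desktop platforms this returns 32, but code should not depend on that.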

Second, the return value from main() is meant to accommodate a wide range of implementations on different operating systems. A POSIX system uses an unsigned 8-bit return code. Windows uses 32 bits that are interpreted by the CMD shell as 2's complement signed. Another OS might choose something else.

And finally, if you're worried about memory "waste", that's an implementation issue that isn't even an issue in this case. Return codes from main are typically returned in machine registers, not in memory, so there is no cost or savings involved. Even if there were, saving 2 bytes in the run of a nontrivial program is not worth any developer's time.
