Preprocessing Tokens: '- -' vs. '--'

Question

Why does the (GCC) preprocessor emit - -B, with two separate - tokens, instead of --B, with a single -- token, in the following example? What is the reasoning that makes the former correct and not the latter?

#define A -B
-A

Output according to gcc -E:

- -B

After all, -- is a valid operator, so theoretically a valid token as well.

Is this specific to the GCC preprocessor or does this follow from the C standards?

Answer 1

The preprocessor works on tokens, not strings. Macro substitution without ## cannot create a new token. So when the preprocessor output goes to a text file instead of straight into the compiler, preprocessors insert whitespace so that the text file can be used as C input again without changing its meaning.

The space insertion doesn't seem to be required by the standard, but then the standard describes the preprocessor as working on tokens and as feeding its output to the compiler proper, not to a text file.
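The difference between plain substitution and ## can be sketched in a small test. The macro and function names below (ADJACENT, PASTE, plain_substitution, token_pasting, check_tokens) are made up for illustration and do not come from the question.

```c
#include <assert.h>

/* Hypothetical helper macros, not part of the original question. */
#define ADJACENT(a, b) a b    /* plain substitution: the tokens stay separate */
#define PASTE(a, b)    a##b   /* ## glues its two operands into one token */

/* Expands to the tokens - - *p: double negation, *p is left alone. */
int plain_substitution(int *p) { return ADJACENT(-, -) *p; }

/* Expands to the tokens -- *p: pre-decrement, *p is modified. */
int token_pasting(int *p) { return PASTE(-, -) *p; }

/* Small self-check showing the two results differ. */
void check_tokens(void) {
    int x = 4;
    assert(plain_substitution(&x) == 4 && x == 4);
    assert(token_pasting(&x) == 3 && x == 3);
}
```

Only the ## version yields a decrement, which matches the claim that ordinary macro substitution never merges two - tokens into --.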

Answer 2

Focusing on the whitespace insertion misses the issue.

The macro A is defined as the sequence of preprocessing tokens - and B.

When the compiler parses the source fragment -A, it produces two preprocessing tokens, - and A. A is then expanded during preprocessing, and the resulting preprocessing tokens are converted to C tokens: -, - and B.

If B is itself defined as a macro (#define B 4), -A would expand to -, -, 4, which is parsed as an expression evaluating to the value 4 with type int.
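This can be checked at compile time. The sketch below assumes a C11 (or later) compiler for static_assert; the function name negated_macro is hypothetical.

```c
#include <assert.h>

#define B 4
#define A -B   /* replacement list is two tokens: - and B */

/* -A becomes the tokens - - 4. If they merged into --4, this would not
   compile, because 4 is not a modifiable lvalue. */
static_assert(-A == 4, "-A is - - 4, which evaluates to 4");

/* Returns the value of -A at run time. */
int negated_macro(void) { return -A; }
```

The fact that this compiles at all suggests the two - tokens stay separate inside the compiler, regardless of how gcc -E prints them.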

gcc -E produces text. For that text to convert back to the same sequence of tokens as the original source code, a space must be inserted between the two - tokens to prevent -- from being parsed as a single token.
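The reason adjacent - characters merge when read back as text is that the C lexer always takes the longest possible token. A minimal sketch of that rule, using hypothetical function names (spaced, unspaced, check_munch):

```c
#include <assert.h>

/* Written with spaces: a minus the pre-decremented b. */
int spaced(int a, int b) { return a - --b; }

/* Written without spaces: the lexer takes -- first, then -,
   so this is read as (a--) - b. */
int unspaced(int a, int b) { return a---b; }

/* Small self-check: same characters, different tokens, different results. */
void check_munch(void) {
    assert(spaced(5, 2) == 4);    /* 5 - 1 */
    assert(unspaced(5, 2) == 3);  /* 5 - 2 */
}
```

The same rule would turn a textual --B into a single -- token followed by B, which is why the printed output needs the space.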

2 answers

solution1
3 2018-05-30 13:31:34

solution2
1 ACCEPTED 2018-05-30 16:49:52