
Why is `int test {}` a function definition in the C language BNF?

I'm interested in the well-known "The syntax of C in Backus-Naur Form" and have been studying it for a while. What confuses me is that some syntax looks wrong to me but is considered valid according to the BNF.

For example, what is int test {}? I think this is ill-formed C, but the BNF treats it as a function definition:

int -> type_const -> type_spec -> decl_specs
test-> id -> direct_declarator -> declarator
'{' '}' -> compound_stat
decl_specs declarator compound_stat -> function_definition

I tried this with bison, and it accepted the input int test {} as well-formed, but a C compiler refuses to compile it.
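The derivation above can be mirrored by a tiny hand-written recognizer. This is only a sketch, not bison and not the full grammar: it assumes decl_specs is just the keyword int, declarator is an identifier optionally followed by an empty parameter list, and compound_stat is an empty pair of braces. All function names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Skip leading whitespace. */
static const char *skip_ws(const char *s) {
    while (isspace((unsigned char)*s)) s++;
    return s;
}

/* Match one punctuator; return the position after it, or NULL. */
static const char *match_punct(const char *s, char c) {
    s = skip_ws(s);
    return (*s == c) ? s + 1 : NULL;
}

/* Match an identifier-shaped word. If word is non-NULL, the text
   must be exactly that word. Returns the position after it, or NULL. */
static const char *match_id(const char *s, const char *word) {
    s = skip_ws(s);
    const char *start = s;
    if (!(isalpha((unsigned char)*s) || *s == '_')) return NULL;
    while (isalnum((unsigned char)*s) || *s == '_') s++;
    size_t len = (size_t)(s - start);
    if (word && (strlen(word) != len || strncmp(start, word, len) != 0))
        return NULL;
    return s;
}

/* function_definition : decl_specs declarator compound_stat
   (toy subset; returns 1 if the whole input matches, else 0) */
int toy_is_function_definition(const char *s) {
    const char *p = match_id(s, "int");    /* decl_specs */
    if (!p) return 0;
    if (match_id(p, "int")) return 0;      /* a keyword is not an id */
    p = match_id(p, NULL);                 /* direct_declarator : id */
    if (!p) return 0;
    const char *q = match_punct(p, '(');   /* optional "()" */
    if (q) {
        p = match_punct(q, ')');
        if (!p) return 0;
    }
    p = match_punct(p, '{');               /* compound_stat : '{' '}' */
    if (!p) return 0;
    p = match_punct(p, '}');
    if (!p) return 0;
    return *skip_ws(p) == '\0';            /* nothing may follow */
}
```

With this subset, toy_is_function_definition("int test {}") accepts the input just as toy_is_function_definition("int test() {}") does, because the rule itself never requires the declarator to contain a parameter list.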

So I have some questions:

  1. Is int test {} valid syntax or not?
  2. If it is valid syntax, what does it mean, and why doesn't the compiler recognize it?
  3. If it is invalid syntax, can I say the BNF is not rigorous? And does that mean modern C compilers do not stick to this BNF?

The grammar is necessary but not sufficient to describe a valid C program. For that you need the constraints from the standard too. A simpler example would be 0++, which follows the syntax of a C expression but certainly isn't a valid program fragment.
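As a small illustration of the 0++ case: the grammar derives it as a postfix expression applied to a constant, but C11 6.5.2.4p1 requires the operand of postfix ++ to be a modifiable lvalue. In the sketch below the invalid line is left in a comment so the fragment still compiles; the function name next_value is made up for the example.

```c
int next_value(void)
{
    int counter = 0;

    /* 0++;        parses as postfix_exp '++', but 0 is not a modifiable
                   lvalue, so a conforming compiler must diagnose it */

    counter++;  /* same syntax, and the operand is a modifiable lvalue */
    return counter;
}
```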

C11 6.9.1p2 :

  1. The identifier declared in a function definition (which is the name of the function) shall have a function type, as specified by the declarator portion of the function definition. [162]

Footnote 162 explains that the intent of the constraint is that the function type in a definition cannot come from a typedef, i.e. that

typedef int F(void);
F f { /* ... */ }

will not be valid, even though such a typedef could be used for a function declaration, i.e.

F f;

would declare the function

int f(void);

But the mere existence of this constraint also proves that the BNF grammar by itself is not sufficient in this case. Hence you are correct that the grammar alone would consider such a fragment a function definition.
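Putting the pieces together, a compilable sketch of the typedef case might look like the following. The return value 42 is an arbitrary choice for the example, and the rejected definition is kept in a comment.

```c
typedef int F(void);  /* F: function with no parameters returning int */

F f;                  /* declaration only, equivalent to int f(void); */

/* F f { return 42; }    matches decl_specs declarator compound_stat,
                         but violates the constraint in C11 6.9.1p2 */

int f(void)           /* definition: the declarator itself supplies
                         the function type */
{
    return 42;
}
```

The definition is compatible with the earlier declaration made through F, which is why the two can coexist in one translation unit.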

BNF is a precise way to describe the syntax of a language, i.e. exactly how to get the parse tree starting from raw input.

For each language you can define infinitely many grammars that describe it. The properties of grammars that describe the same language can differ a lot.

If you study the grammar of the C language, take care: it is not context-free but context-sensitive, which means that the choice between one rule and another depends on what surrounds that point in the input.

Read about the lexer hack to see how to correctly interpret the Backus-Naur form of the C grammar.
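A short sketch of the kind of context dependence the lexer hack deals with: the token sequence T * x is a declaration when T names a type and a multiplication when T names a variable, so the parser needs to know what T currently refers to. The function names and values here are invented for the example.

```c
typedef int T;

int as_declaration(void)
{
    int value = 7;
    T * x;              /* T is a typedef name: declares x as int * */
    x = &value;
    return *x;
}

int as_expression(void)
{
    int T = 6, x = 7;   /* this T is a variable that hides the typedef */
    return T * x;       /* same tokens, now a multiplication */
}
```

The BNF alone cannot tell these two readings apart, which is why implementations typically let the lexer consult the symbol table to decide whether an identifier is a typedef name.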
