What does a parser for C++ do until it can differentiate between comparisons and template instantiations?

Question

After reading this question I am left wondering what happens (regarding the AST) when major C++ compilers parse code like this:

struct foo 
{
  void method() { a<b>c; }

  // a b c may be declared here
};

Do they handle it like a GLR parser would or in a different way? What other ways are there to parse this and similar cases?

For example, I think it's possible to postpone parsing the body of the method until the whole struct has been parsed, but is this really possible and practical?

Answer 1

The answer will obviously depend on the compiler, but the article How Clang handles the type / variable name ambiguity of C/C++ by Eli Bendersky explains how Clang does it. I will simply note some key points from the article:

Clang has no need for a lexer hack: the information goes in a single direction from lexer to parser
Clang knows when an identifier is a type by using a symbol table
C++ requires declarations to be visible throughout the class, even in code that appears before it
Clang gets around this by doing a full parsing/semantic analysis of the declaration, but leaving the definition for later; in other words, it's lexed but parsed after all the declared types are available

Answer 2

Although it is certainly possible to use GLR techniques to parse C++ (see a number of answers by Ira Baxter ), I believe that the approach commonly used in commonly-used compilers such as gcc and clang is precisely that of deferring the parse of function bodies until the class definition is complete. (Since C++ source code passes through a preprocessor before being parsed, the parser works on streams of tokens and that is what must be saved in order to reparse the function body. I don't believe that it is feasible to reparse the source code.)

It's easy to know when a function definition is complete, since braces ( {} ) must balance even if it is not known how angle brackets nest.

C++ is not the only language in which it is useful to defer parsing until declarations have been handled. For example, a language which allows users to define new operators with different precedences would require all expressions to be (re-)parsed once the names and precedences of operators are known. A more pathological example is COBOL, in which the precedence of OR in a = b OR c depends on whether c is an integer ( a is equal to one of b or c ) or a boolean ( a is equal to b or c is true). Whether designing languages in this manner is a good idea is another question.

What does a parser for C++ do until it can differentiate between comparisons and template instantiations?

Question

2 answers

solution1
2 2018-10-26 16:37:26

solution2
2 ACCPTED 2018-10-26 16:43:35

What does a parser for C++ do until it can differentiate between comparisons and template instantiations?

Question

2 answers

solution1 2 2018-10-26 16:37:26

solution2 2 ACCPTED 2018-10-26 16:43:35

solution1
2 2018-10-26 16:37:26

solution2
2 ACCPTED 2018-10-26 16:43:35