How does a parser for C++ differentiate between comparisons and template instantiations?

Question

In C++, the symbols '<' and '>' are used for comparisons as well as for signifying a template argument. Thus, the code snippet

[...] Foo < Bar > [...]

might be interpreted as any of the following two ways:

An object of type Foo with template argument Bar
Compare Foo to Bar, then compare the result to whatever comes next

How does the parser for a C++ compiler efficiently decide between those two possibilities?

Answer 1

If Foo is known to be a template name (eg a template <...> Foo ... declaration is in scope, or the compiler sees a template Foo sequence), then Foo < Bar cannot be a comparison. It must be a beginning of a template instantiation (or whatever Foo < Bar > is called this week).

If Foo is not a template name, then Foo < Bar is a comparison.

In most cases it is known what Foo is, because identifiers generally have to be declared before use, so there's no problem to decide one way or the other. There's one exception though: parsing template code. If Foo<Bar> is inside a template, and the meaning of Foo depends on a template parameter, then it is not known whether Foo is a template or not. The language standard directs to treat it as a non-template unless preceded by the keyword template .

The parser might implement this by feeding context back to the lexer. The lexer recognizes Foo as different types of tokens, depending on the context provided by the parser.

Answer 2

The important point to remember is that C++ grammar is not context-free. Ie, when the parser sees Foo < Bar (in most cases) knows that Foo refers to a template definition (by looking it up in the symbol table), and thus < cannot be a comparison.

There are difficult cases, when you literally have to guide the parser. For example, suppose that are writing a class template with a template member function, which you want to specialize explicitly. You might have to use syntax like:

 a->template foo<int>();

(in some cases; see Calling template function within template class for details)

Also, comparisons inside non-type template arguments must be surrounded by parentheses, ie:

foo<(A > B)>

not

foo<A > B>

Non-static data member initializers bring more fun: http://open-std.org/JTC1/SC22/WG21/docs/cwg_active.html#325

Answer 3

C and C++ parsers are "context sensitive", in other words, for a given token or lexeme, it is not guaranteed to be distinct and have only one meaning - it depends on the context within which the token is used.

So, the parser part of the compiler will know (by understanding "where in the source it is") that it is parsing some kind of type or some kind of comparison (This is NOT simple to know, which is why reading the source of competent C or C++ compiler is not entirely straight forward - there are lots of conditions and function calls checking "is this one of these, if so do this, else do something else").

The keyword template helps the compiler understand what is going on, but in most cases, the compiler simply knows because < doesn't make sense in the other aspect - and if it doesn't make sense in EITHER form, then it's an error, so then it's just a matter of trying to figure out what the programmer might have wanted - and this is one of the reasons that sometimes, a simple mistake such as a missing } or template can lead the entire parsing astray and result in hundreds or thousands of errors [although sane compilers stop after a reasonable number to not fill the entire universe with error messages]

Answer 4

Most of the answers here confuse determining the meaning of the symbol (what I call "name resolution") with parsing (defined narrowly as "can read the syntax of the program").

You can do these tasks separately. .

What this means is that you can build a completely context-free parser for C++ (as my company, Semantic Designs does), and leave the issues of deciding what the meaning of the symbol is to a explicitly seperate following task.

Now, that task is driven by the possible syntax interpretations of the source code. In our parsers, these are captured as ambiguities in the parse.

What name resolution does is collect information about the declarations of names, and use that information to determine which of the ambiguous parses doesn't make sense, and simply drop those. What remains is a single valid parse, with a single valid interpretation.

The machinery to accomplish name resolution in practice is a big mess. But that's the C++ committee's fault, not the parser or name resolver. The ambiguity removal with our tool is actually done automatically, making that part actually pretty nice but if you don't look inside our tools you would not appreciate that, but we do because it means a small engineering team was able to build it.

See an example of resolution of template-vs-less than on C++s most vexing parse done by our parser.

How does a parser for C++ differentiate between comparisons and template instantiations?

Question

4 answers

solution1
14 ACCPTED 2016-02-14 11:49:42

solution2
10 2016-02-14 12:01:45

solution3
5 2016-02-14 11:52:22

solution4
2 2016-02-14 18:36:11

How does a parser for C++ differentiate between comparisons and template instantiations?

Question

4 answers

solution1 14 ACCPTED 2016-02-14 11:49:42

solution2 10 2016-02-14 12:01:45

solution3 5 2016-02-14 11:52:22

solution4 2 2016-02-14 18:36:11

solution1
14 ACCPTED 2016-02-14 11:49:42

solution2
10 2016-02-14 12:01:45

solution3
5 2016-02-14 11:52:22

solution4
2 2016-02-14 18:36:11