简体   繁体   中英

Ternary Operator Parsing with the Shunting Yard Algorithm

For context, please read this question about Ternary Operators first.

I am building my own programming language that allows you to define custom operators. Because I want it to have as few compiler built-ins as possible, it should allow the definition of custom ternary operators, preferably in the form

infix operator ? : { precedence 120 }

My (hand-written) Expression Parser will turn nested ternary operators into a list of operands separated by operators.

a ? b ? c : d : e
(a) ? (b) ? (c) : (d) : (d)
OperatorChain(operators: [?, ?, :, :], operands: [a, b, c, d, e])

The OperatorChain class now looks up the operators from operator definitions in scope and converts the list into binary AST nodes using a modified version of the shunting yard algorithm, which is shown below:

// Note: OperatorElement is a class that merely stores an Identifier, an associated source code position and the resolved operator.
// IValue is the base interface for all Expression AST nodes

final Stack<OperatorElement> operatorStack = new LinkedList<>();
final Stack<IValue> operandStack = new LinkedList<>();
operandStack.push(this.operands[0]);

for (int i = 0; i < this.operatorCount; i++)
{
    final OperatorElement element1 = this.operators[i];
    OperatorElement element2;
    while (!operatorStack.isEmpty())
    {
        element2 = operatorStack.peek();

        final int comparePrecedence = element1.operator.comparePrecedence(element2.operator);
        if (comparePrecedence < 0
                || element1.operator.getAssociativity() != IOperator.RIGHT && comparePrecedence == 0)
        {
            operatorStack.pop();
            this.pushCall(operandStack, element2);
        }
        else
        {
            break;
        }
    }
    operatorStack.push(element1);
    operandStack.push(this.operands[i + 1]);
}
while (!operatorStack.isEmpty())
{
    this.pushCall(operandStack, operatorStack.pop());
}

return operandStack.pop().resolve(markers, context);

How would I need to modify this algorithm to work with ternary operators, including custom ones?

I've implemented a mathematical parser in java which can handle ternary operators. The heart of this is the expression method. The input is contained in an iterator it with a it.peekNext() method to view the next token and it.consume() move to the next token. It calls the prefixSuffix() to read constants and variables with possible prefix and suffix operators eg ++x .

protected void expression() throws ParseException  {

    prefixSuffix(); 

    Token t = it.peekNext();
    while(t!=null) {
        if(t.isBinary()) {
            OperatorToken ot = (OperatorToken)t;
            Operator op = ot.getBinaryOp();
            pushOp(op,ot);
            it.consume();
            prefixSuffix();
        }
        else if(t.isTernary()){
            OperatorToken ot =(OperatorToken)t;
            Operator to = ot.getTernaryOp();
            pushOp(to,ot);
            it.consume();
            prefixSuffix();
        }
        else
            break;
        t=it.peekNext();
    }
    // read all remaining elements off the stack
    while(!sentinel.equals(ops.peek())) {
        popOp();
    }
}

So when either of the tokens is encountered it calls the pushOp methods to push them on the stack. Each token has an associated operator which is also parsed to pushOp .

pushOp compare the new operator with the top of the stack, poping if necessary

protected void pushOp(Operator op,Token tok) throws ParseException 
{
    while(compareOps(ops.peek(),op))
        popOp();
    ops.push(op); 
}

The logic for dealing with tenary operators happen in compareOps :

/**
 * Compare operators based on their precedence and associativity.
 * @param op1
 * @param op2
 * @return true if op1 has a lower precedence than op2, or equal precedence and a left assoc op, etc.
 */
protected boolean compareOps(Operator op1,Operator op2)
{
    if(op1.isTernary() && op2.isTernary()) {
        if(op1 instanceof TernaryOperator.RhsTernaryOperator &&
                op2 instanceof TernaryOperator.RhsTernaryOperator )
            return true;
        return false;
    }
    if(op1 == sentinel ) return false;
    if(op2 == sentinel ) return true;
    if(op2.isPrefix() && op1.isBinary()) return false;
    if(op1.getPrecedence() < op2.getPrecedence()) return true;
    if(op1.getPrecedence() == op2.getPrecedence() && op1.isLeftBinding()) return true;
    return false;
}

If both operators are the right hand of a ternary operator then compareOps returns true one operator will be popped. Otherwise if both are the left hand ternary operators or one is a left and one is a right then compareOps returns false and no operators are popped.

The other bit of handling happens in the popOp method:

protected void popOp() throws ParseException
{
    Operator op = ops.pop();

    if(op == implicitMul) {
        Node rhs = nodes.pop();
        Node lhs = nodes.pop();
        Node node = nf.buildOperatorNode(
                jep.getOperatorTable().getMultiply(), 
                lhs, rhs);
        nodes.push(node);
    }
    else if(op.isBinary()) {
        Node rhs = nodes.pop();
        Node lhs = nodes.pop();
        Node node = nf.buildOperatorNode(op, lhs, rhs);
        nodes.push(node);
    }
    else if(op.isUnary()) {
        Node lhs = nodes.pop();
        Node node = nf.buildOperatorNode(op, lhs);
        nodes.push(node);
    }
    else if(op.isTernary() && op instanceof TernaryOperator.RhsTernaryOperator ) {
        Operator op2 = ops.pop();
        if(!(op2 instanceof TernaryOperator ) || !((TernaryOperator) op2).getRhsOperator().equals(op)) {
            throw new ParseException(
                    MessageFormat.format(JepMessages.getString("configurableparser.ShuntingYard.NextOperatorShouldHaveBeenMatchingTernaryOp"),op2.getName())); //$NON-NLS-1$

        }

        Node rhs = nodes.pop();
        Node middle = nodes.pop();
        Node lhs = nodes.pop();
        Node node = nf.buildOperatorNode(op2, new Node[]{lhs,middle,rhs});
        nodes.push(node);
    }
    else {
        throw new ParseException(MessageFormat.format(JepMessages.getString("configurableparser.ShuntingYard.InvalidOperatorOnStack"),op.getSymbol())); //$NON-NLS-1$
    }
}

Only the right hand side of the ternary operator should be encountered. The operator directly below it should be the corresponding left hand operator. (Any operator with a lower precedence will have already been popped, the only operators with higher precedence are assignment operators which can't occur inside a ternary operator).

Both the left and right hands operators are now popped. I'm constructing a tree and the last three nodes produced are removed with a new ternary operator node constructed.

The following is not exactly what the OP was asking for, but there is another pragmatic solution which might help someone coming across...

I had a pretty standard shunting yard algorithm with prefix/2-operand-infix support, and needed it to additionally handle just the one ternary operator ?: (which is probably the only one out there, anyway... ;-) ). I also didn't care about associativity or precedence since writing something like a? b: c? d: e a? b: c? d: e a? b: c? d: e is just not nice and I wanted it to be explicitly forbidden. (Brackets should be used in this case: a? b: (c? d: e) or (a? b: c)? d: e .)

Given this, there is a simple solution:

  1. Create a new (2-operand) infix operator ? .
  2. Create a new (2-operand) infix operator : with lower precedence than the ? precedence.

Both precedences should be lower than those of + / - / * etc. to parse a > 1? x + 4: y + 5 a > 1? x + 4: y + 5 as (a > 1)? (x + 4): (y + 5) (a > 1)? (x + 4): (y + 5) .

Now you'll get a parse tree with elements looking like this:

在此处输入图像描述

  1. Do a post-processing step replacing each element of this type ( : node with left child ? ) with another custom node ?:(c, a, b) .

在此处输入图像描述

You'll end up with a tree containing ? or : nodes if and only if

  • there are nested ternary operations in the code that don't have proper brackets or
  • there is some syntactical misuse of the ? or : operators.

Print an error message in that case.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM