Text query parsing in Sprache

I'm trying to write some code to match strings based on a pattern:

pattern: "dog and (cat or goat)"

test string: "doggoat" result: true

test string: "dogfrog" result: false

I'm trying to write a parser using Sprache, with most of the logic provided by Corey's excellent answer to a similar problem. I'm almost there, but I'm getting an exception when the code is run:

"The binary operator AndAlso is not defined for the types 'System.Func`2[System.String,System.Boolean]' and 'System.Func`2[System.String,System.Boolean]'."

I understand this to mean that I need to combine the lambdas at the expression tree nodes with the logical operators, which I've attempted using an ExpressionVisitor based on the answer to another question here. However, the program crashes before the ExpressionVisitor is executed. It appears that the Parse command is executed first, but I don't quite understand why (maybe because the Sprache.Parse.Select statement doesn't force execution of the lambda?), or how to force the visitor to run first.
Sample code is below. I've stripped out all operators except 'and' for brevity; reintroducing them from Corey's template is trivial. Sprache must be added from NuGet for the code to compile.

using System;
using System.Linq.Expressions;
using System.Reflection;
using Sprache;

class Program
{
    static void Main(string[] args)
    {
        var patternString = "dog and cat";

        var strTest = "dog cat";
        var strTest2 = "dog frog";

        var conditionTest = ConditionParser.ParseCondition(patternString);

        var fnTest = conditionTest.Compile();
        bool res1 = fnTest(strTest); //true
        bool res2 = fnTest(strTest2); //false
    }
}

public static class ConditionParser
{
    static ParameterExpression Param = Expression.Parameter(typeof(string), "_");

    public static Expression<Func<string, bool>> ParseCondition(string text)
    {
        return Lambda.Parse(text);
    }

    private static Parser<Expression<Func<string, bool>>> Lambda
    {
        get
        {
            var reduced = AndTerm.End().Select(delegate (Expression body)
            {
                var replacer = new ParameterReplacer(Param);
                return Expression.Lambda<Func<string, bool>>((BinaryExpression)replacer.Visit(body), Param);
            });

            return reduced;
        }
    }

    static Parser<Expression> AndTerm =>
        Parse.ChainOperator(OpAnd, StringMatch, Expression.MakeBinary);
    // Other operators (or, not etc.) can be chained here, between AndTerm and StringMatch

    static Parser<ExpressionType> OpAnd = MakeOperator("and", ExpressionType.AndAlso);

    private static Parser<Expression> StringMatch =>
        Parse.Letter.AtLeastOnce()
        .Text().Token()
        .Select(value => StringContains(value));

    static Expression StringContains(string subString)
    {
        MethodInfo contains = typeof(string).GetMethod("Contains");

        var call = Expression.Call(
            Expression.Constant(subString),
            contains,
            Param
        );

        var ret = Expression.Lambda<Func<string, bool>>(call, Param);
        return ret;
    }

    // Helper: define an operator parser
    static Parser<ExpressionType> MakeOperator(string token, ExpressionType type)
        => Parse.IgnoreCase(token).Token().Return(type);
}

internal class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _parameter;

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return base.VisitParameter(_parameter);
    }

    internal ParameterReplacer(ParameterExpression parameter)
    {
        _parameter = parameter;
    }
}

There are several issues with your code, but the main cause of the exception is that the StringContains method returns a lambda expression. Expression.AndAlso (like most Expression factory methods) operates on simple non-lambda expressions (or lambda expression bodies), so it cannot combine two operands of type Func<string, bool>. The whole idea of the parsing code is to identify and combine simple expressions, and then build a single lambda expression from the result.
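The following is a minimal sketch of that point, using only System.Linq.Expressions (no Sprache). The class AndAlsoDemo and its helpers ContainsBody, LambdaOperandsThrow and BuildMatcher are hypothetical names introduced for illustration. It suggests that Expression.MakeBinary rejects lambda-typed operands as soon as it is called, which would also explain why the exception appears during parsing: ChainOperator invokes MakeBinary while combining operands, before the Select callback gets a chance to run.

```csharp
using System;
using System.Linq.Expressions;

public static class AndAlsoDemo
{
    // Shared parameter, standing in for ConditionParser.Param.
    static readonly ParameterExpression Param = Expression.Parameter(typeof(string), "_");

    // Builds the bool-typed body _.Contains(token), with no lambda wrapper.
    static Expression ContainsBody(string token) =>
        Expression.Call(Param, "Contains", Type.EmptyTypes, Expression.Constant(token));

    // Returns true if AndAlso rejects two operands of type Func<string, bool>.
    public static bool LambdaOperandsThrow()
    {
        Expression left = Expression.Lambda<Func<string, bool>>(ContainsBody("dog"), Param);
        Expression right = Expression.Lambda<Func<string, bool>>(ContainsBody("cat"), Param);
        try
        {
            Expression.MakeBinary(ExpressionType.AndAlso, left, right);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Combines the bool-typed bodies first, then wraps the result in one lambda.
    public static Func<string, bool> BuildMatcher()
    {
        Expression body = Expression.MakeBinary(
            ExpressionType.AndAlso, ContainsBody("dog"), ContainsBody("cat"));
        return Expression.Lambda<Func<string, bool>>(body, Param).Compile();
    }

    public static void Main()
    {
        Console.WriteLine(LambdaOperandsThrow());
        Func<string, bool> match = BuildMatcher();
        Console.WriteLine(match("dog cat"));
        Console.WriteLine(match("dog frog"));
    }
}
```

Combining the bodies and wrapping once at the end mirrors what the parser is meant to do: each StringMatch contributes a bool-typed node, and only the final Select produces a lambda.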

To fix the original problem, StringContains should return the MethodCall expression directly rather than wrapping it in a lambda expression.

The second problem in the same StringContains method is that it reverses the arguments to string.Contains. It effectively evaluates token.Contains(parameter), while the expected results require parameter.Contains(token).

The whole method (using another handy Expression.Call overload) can be reduced to

static Expression StringContains(string subString) =>
    Expression.Call(Param, "Contains", Type.EmptyTypes, Expression.Constant(subString));
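To make the argument-order issue concrete, here is a small sketch that compiles both variants and runs them against the same input. ContainsOrderDemo, TokenContainsInput and InputContainsToken are hypothetical names, and the explicit GetMethod overload selection is an assumption made here to keep the example independent of how many Contains overloads the runtime exposes.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class ContainsOrderDemo
{
    static readonly ParameterExpression Param = Expression.Parameter(typeof(string), "_");

    // Select the string.Contains(string) overload explicitly.
    static readonly MethodInfo ContainsMethod =
        typeof(string).GetMethod("Contains", new[] { typeof(string) });

    // Order used in the question: token.Contains(input).
    public static Func<string, bool> TokenContainsInput(string token) =>
        Expression.Lambda<Func<string, bool>>(
            Expression.Call(Expression.Constant(token), ContainsMethod, Param),
            Param).Compile();

    // Corrected order: input.Contains(token).
    public static Func<string, bool> InputContainsToken(string token) =>
        Expression.Lambda<Func<string, bool>>(
            Expression.Call(Param, ContainsMethod, Expression.Constant(token)),
            Param).Compile();

    public static void Main()
    {
        // Evaluates "dog".Contains("dog cat").
        Console.WriteLine(TokenContainsInput("dog")("dog cat"));
        // Evaluates "dog cat".Contains("dog").
        Console.WriteLine(InputContainsToken("dog")("dog cat"));
    }
}
```

With the question's order, the token "dog" is asked whether it contains the whole test string, so the result is false whenever the test string is longer than the token. Only the corrected order returns true for "dog cat".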

Now everything should work as expected.

However, since the ConditionParser class uses a single ParameterExpression instance, both inside the parsed expressions and when building the lambda expression, there is no need for ParameterReplacer, so the Lambda method (property) can be reduced to

private static Parser<Expression<Func<string, bool>>> Lambda =>
    AndTerm.End().Select(body => Expression.Lambda<Func<string, bool>>(body, Param));
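A brief sketch of why the shared instance is enough: expression trees bind parameters by object identity, not by name. SharedParameterDemo and its methods are hypothetical names used only for this illustration, and the failure case assumes the usual behavior of Compile on the standard runtime. The simplified property above also drops the (BinaryExpression) cast, which would presumably fail for a single-word pattern whose body is a method call rather than a binary node.

```csharp
using System;
using System.Linq.Expressions;

public static class SharedParameterDemo
{
    // Builds input.Contains(token) as a bool-typed body.
    static Expression ContainsBody(ParameterExpression input, string token) =>
        Expression.Call(input, "Contains", Type.EmptyTypes, Expression.Constant(token));

    // Same instance in the body and in the lambda signature: no rewriting needed.
    public static Func<string, bool> SameInstance()
    {
        ParameterExpression param = Expression.Parameter(typeof(string), "_");
        return Expression.Lambda<Func<string, bool>>(ContainsBody(param, "dog"), param).Compile();
    }

    // Two distinct instances that merely share a name: returns true if
    // Compile rejects the parameter that the lambda never declared.
    public static bool DifferentInstancesFail()
    {
        ParameterExpression inBody = Expression.Parameter(typeof(string), "_");
        ParameterExpression inSignature = Expression.Parameter(typeof(string), "_");
        var lambda = Expression.Lambda<Func<string, bool>>(ContainsBody(inBody, "dog"), inSignature);
        try
        {
            lambda.Compile();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SameInstance()("doggoat"));
        Console.WriteLine(DifferentInstancesFail());
    }
}
```

A visitor such as ParameterReplacer is only useful in the second situation, for example when merging lambdas that were created separately with their own parameters.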
