Good Design For Regex, Capture Groups And Unit Testing

In a project, I'm experimenting with using regular expressions to distinguish between various types of sentences and map each one to a function that handles it.

Most of these sentence-handling functions take their arguments from the sentence itself, parsed out by capture groups in the regular expression.

Example: "I paid $20 for 2 cookies" is matched by one of the regular expressions in my parse tree (a dictionary). The regex would extract $20 as the group "payment" and 2 as the group "amount". Currently I map to the correct handler and call it as follows:

foreach(KeyValuePair<Regex, Type> pair in sentenceTypes)
{
    Match match = pair.Key.Match(text);
    if(match.Success)
    {
        IHandler handler = handlerFactory.CreateHandler(pair.Value);
        output = handler.Handle(match);
    }
}

Example of a simple handler class:

public class NoteCookiePriceHandler : IHandler
{
    public string Handle(Match match)
    {
        // The named groups are read straight off the Match object.
        double payment = Convert.ToDouble(match.Result("${payment}"));
        int amount = Convert.ToInt32(match.Result("${amount}"));

        double price = payment / amount;
        return "The price is $" + price;
    }
}

I was trying to set up some unit tests with Moq when I realized I can't actually mock a Match object or a Regex. Thinking about it more, the design seems somewhat flawed in general: I depend on named groups being correctly parsed and handed to the handler class, with no real interface between the two.

I am looking for suggestions on a more effective design for passing parameters correctly to a mapped handler function or class, since passing the Match object seems problematic.

Failing that, any help in figuring out a way to mock Regex or Match effectively would be appreciated, and would at least solve my short-term problem. Both lack default constructors, so I am having a hard time getting Moq to create instances of them.

Edit: I ended up solving at least the mocking problem by passing a dictionary of strings for my match groups, rather than the (un-Moq-able) Match object itself. I'm not particularly happy with this solution, so recommendations would still be appreciated.

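foreach (KeyValuePair<Regex, Type> pair in sentenceTypes)
{
    match = pair.Key.Match(text);
    if (match.Success)
    {
        IHandler handler = handlerFactory.CreateHandler(pair.Value);
        // Copy each capture group into a plain dictionary so the handler never sees the Match.
        foreach (string groupName in pair.Key.GetGroupNames())
        {
            // Indexer assignment avoids a duplicate-key exception if the dictionary is reused.
            matchGroups[groupName] = match.Groups[groupName].Value;
        }
        interpretation = handler.Handle(matchGroups);
    }
}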

One way to avoid bad design is to start from the principles of good design rather than from the problem you want to solve. This is one of the reasons test-driven development is so powerful in transforming the quality of code. This way of thinking existed well before TDD, under the name design by contract. Allow me to demonstrate:

What would you like the ideal handler to look like? How about this:

interface IHandler {
    String handle();
}

Implementation:

public class NoteCookiePriceHandler : IHandler
{  
    private double payment;
    private int amount;

    public NoteCookiePriceHandler(double payment, int amount) {
        this.payment = payment;
        this.amount = amount;
    }

    public String handle() {
        return "The price is $" + payment / amount;
    }
}

Now start from this ideal design, perhaps from the tests for it. How do we get the values parsed from a sentence to the handlers? Well, any problem in computer science can be solved with another layer of indirection. Let's say the sentence parser does not create the handler directly, but uses a factory to create one:

interface HandlerFactory<T> where T : IHandler {
    // One entry per capture group: group name -> captured text.
    T getHandler(IDictionary<String, String> captureGroups);
}

You could then create one factory per handler, but soon enough you would find a way to create a generic factory. Using reflection, for example, you could match the capture group names to the constructor parameters. Based on the data types of the constructor parameters, the generic handler factory could automatically convert your strings to the correct types. All of this would be easy to test by creating some fake handlers and asking the factory to populate them from key-value pairs of strings.
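
To make the reflection idea concrete, here is a minimal sketch of such a generic factory. It is only one possible shape, not a definitive implementation: ReflectionHandlerFactory and FakeHandler are hypothetical names, and it assumes the capture groups arrive as an IDictionary<string, string> mapping group name to captured text. Conversion relies on Convert.ChangeType, so it only covers simple types such as int and double.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;

public interface IHandler
{
    string handle();
}

// Hypothetical fake handler, used only to exercise the factory.
public class FakeHandler : IHandler
{
    private readonly double payment;
    private readonly int amount;

    public FakeHandler(double payment, int amount)
    {
        this.payment = payment;
        this.amount = amount;
    }

    public string handle()
    {
        return payment.ToString(CultureInfo.InvariantCulture) + "/" + amount;
    }
}

// Hypothetical generic factory: matches capture group names to
// constructor parameter names and converts the strings by parameter type.
public class ReflectionHandlerFactory<T> where T : IHandler
{
    public T getHandler(IDictionary<string, string> captureGroups)
    {
        // Try the constructors with the most parameters first.
        IEnumerable<ConstructorInfo> constructors = typeof(T)
            .GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length);

        foreach (ConstructorInfo ctor in constructors)
        {
            ParameterInfo[] parameters = ctor.GetParameters();

            // Skip constructors that need a value no capture group provides.
            if (!parameters.All(p => captureGroups.ContainsKey(p.Name)))
                continue;

            // Convert each captured string to the declared parameter type.
            object[] args = parameters
                .Select(p => Convert.ChangeType(
                    captureGroups[p.Name], p.ParameterType, CultureInfo.InvariantCulture))
                .ToArray();

            return (T)ctor.Invoke(args);
        }

        throw new ArgumentException(
            "No constructor of " + typeof(T).Name + " matches the capture groups.");
    }
}

public static class Program
{
    public static void Main()
    {
        var groups = new Dictionary<string, string>
        {
            { "payment", "20" },
            { "amount", "2" }
        };

        var factory = new ReflectionHandlerFactory<FakeHandler>();
        IHandler handler = factory.getHandler(groups);
        Console.WriteLine(handler.handle()); // prints "20/2"
    }
}
```

The test for the factory needs neither Regex, Match nor Moq: a plain dictionary stands in for the parsed sentence, and the fake handler simply echoes what it was constructed with. In the parsing loop, the dictionary would be filled from Regex.GetGroupNames() as in the question's edit.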
