How to search patterns in arbitrary sequences?

Regex works only on strings, but what if that functionality could be extended beyond characters to arbitrary objects, or even further to functions? Suppose our objects are integers, which can be in any order:

1 2 3 4 5 6 7 8 9 10 11 12 13

The task is to find prime pairs (or to solve a similar pattern-search task), described by a pattern like this:

{prime}{anyNumber}{prime}

So the answer is:

(3,4,5) (5,6,7) (11,12,13)
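For a fixed-length pattern like this one, a plain sliding window over predicates is enough to reproduce the answer above. The following is only a minimal sketch, not the approach the question ends up with: the names WindowMatcher, FindAll and this IsPrime helper are hypothetical and introduced purely for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class WindowMatcher
{
    // Illustrative trial-division primality check (hypothetical helper).
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; (long)i * i <= n; ++i)
        {
            if (n % i == 0) return false;
        }
        return true;
    }

    // Returns every window of input whose elements satisfy the given
    // predicates position by position. Overlapping windows are included.
    public static List<T[]> FindAll<T>(IReadOnlyList<T> input, params Func<T, bool>[] pattern)
    {
        var result = new List<T[]>();
        for (int start = 0; start + pattern.Length <= input.Count; ++start)
        {
            bool ok = true;
            for (int k = 0; k < pattern.Length && ok; ++k)
            {
                ok = pattern[k](input[start + k]);
            }
            if (!ok) continue;

            // Copy the matching window out of the input.
            var window = new T[pattern.Length];
            for (int k = 0; k < pattern.Length; ++k)
            {
                window[k] = input[start + k];
            }
            result.Add(window);
        }
        return result;
    }
}
```

Calling WindowMatcher.FindAll<int>(input, WindowMatcher.IsPrime, _ => true, WindowMatcher.IsPrime) on the numbers 1 to 13 yields the windows (3,4,5), (5,6,7) and (11,12,13). Note that these windows overlap at 5, which a regular Regex scan would not report; quantifiers such as + also need a real automaton rather than a fixed window.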

Or a slightly more complex example, for a chain of primes:

{prime}({anyNumber}{prime})+

Answer:

(3,(4,5),(6,7)) (11,(12,13))

Pretty much how Regex works, right?

The idea is that you define a function named isPrime(x) and use it whenever you need to check whether the next input element is actually prime (so it acts as a kind of equality test against an object or a class of objects).

What I created so far

I created an ObjectRegex class similar to the Regex class in C#. It accepts patterns like the ones above and executes the predicate associated with each pattern element to identify objects. It works fine, but there is a problem: for it to work, any sequence of type TValue has to be converted to a string before it is passed to the Regex pattern, and to do that I have to apply ALL predicates to the entire sequence. That costs O(n*m) predicate evaluations (n elements, m predicates), which is a bad idea after all.
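The encode-then-match idea described here can be sketched roughly as follows. This is not the actual ObjectRegex code, which the post does not show; the class name EncodedRegexSketch, the method FindPrimeChains and the 'p'/'x' encoding are assumptions made for illustration, and only one predicate is used to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EncodedRegexSketch
{
    // Illustrative trial-division primality check (hypothetical helper).
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; (long)i * i <= n; ++i)
        {
            if (n % i == 0) return false;
        }
        return true;
    }

    // Encodes each element as one char ('p' = prime, 'x' = anything else),
    // runs an ordinary Regex over the encoded string, and maps each match
    // back to the corresponding slice of the original input.
    public static List<int[]> FindPrimeChains(int[] input)
    {
        // Eager step: the predicate runs on every element up front,
        // whether or not the regex engine ever looks at that position.
        var encoded = new string(input.Select(v => IsPrime(v) ? 'p' : 'x').ToArray());

        var result = new List<int[]>();
        foreach (Match m in Regex.Matches(encoded, "p(.p)+"))
        {
            // One char per element, so match offsets are input offsets.
            result.Add(input.Skip(m.Index).Take(m.Length).ToArray());
        }
        return result;
    }
}
```

For the numbers 1 to 13 this returns the chains 3,4,5,6,7 and 11,12,13. With m predicates instead of one, the encoding step would have to evaluate every predicate on every element before matching starts, which is where the O(n*m) cost comes from.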

I decided to work around it the hard way and tried to inherit from string, which is sealed, so inheritance is forbidden. What I need from the inherited class is to override the accessor

char this[int index] {get;}

so that predicate execution is deferred to the moment when it actually makes sense.
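To make the intent concrete, here is a minimal sketch of what such a deferred indexer could look like on a standalone class. The type LazyEncodedSequence and its members are hypothetical, and because it does not derive from string it cannot be passed to Regex, which is exactly the obstacle described above.

```csharp
using System;

public sealed class LazyEncodedSequence<T>
{
    private readonly T[] _items;
    private readonly Func<T, char> _encode;
    private readonly char?[] _cache;

    // Number of times the encoder has actually been invoked.
    public int EvaluationCount { get; private set; }

    public LazyEncodedSequence(T[] items, Func<T, char> encode)
    {
        _items = items;
        _encode = encode;
        _cache = new char?[items.Length];
    }

    public int Length => _items.Length;

    // The encoder (and therefore the predicates behind it) runs only
    // when a position is first read; later reads hit the cache.
    public char this[int index]
    {
        get
        {
            if (!_cache[index].HasValue)
            {
                _cache[index] = _encode(_items[index]);
                EvaluationCount++;
            }
            return _cache[index].Value;
        }
    }
}
```

With this shape, positions the matcher never visits never trigger a predicate call, and positions visited repeatedly during backtracking are evaluated only once.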

So, any ideas on how to do this? I love .NET Regex and its syntax. Is there a way to get around this string curse and trick the engine? Reflection maybe, or some hardcore technique I don't know about?

Update 1

I found this article, http://www.codeproject.com/Articles/463508/NET-CLR-Injection-Modify-IL-Code-during-Run-time, and I think it could be done by replacing the this[int index] method with my own code. However, I think it would corrupt everything else, because you can't replace a method for only one instance.

String inheritance

After some research, I found that optimizing the existing Regex this way is impossible. Even if I know the index in the string, I still don't have access to the possible states of the Regex automaton, which I would need to inspect in order to filter out unnecessary calculations.

ORegex

As the answer, I decided to implement my own engine, similar to the Microsoft Regex engine. The syntax is the same as Microsoft Regex syntax. You can find more information and examples on NuGet and GitHub.

Currently, it supports basic Regex engine features as well as some popular features such as lookahead and capturing.

Example

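using System;              // Math
using System.Diagnostics;  // Trace

public static bool IsPrime(int number)
{
    // 0, 1 and negative numbers are not prime.
    if (number < 2) return false;
    // Trial division only needs to check divisors up to sqrt(number).
    int boundary = (int)Math.Floor(Math.Sqrt(number));
    for (int i = 2; i <= boundary; ++i)
    {
        if (number % i == 0) return false;
    }
    return true;
}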

public void PrimeTest()
{
    // {0} refers to the first predicate passed to the constructor (IsPrime);
    // '.' matches any element.
    var oregex = new ORegex<int>("{0}(.{0})*", IsPrime);
    var input = new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13};
    foreach (var match in oregex.Matches(input))
    {
        Trace.WriteLine(string.Join(",", match.Values));
    }
}

//OUTPUT:
//2
//3,4,5,6,7
//11,12,13
