Reversed regex machine implementation
I'm trying to match a string starting from its last character, so that a failing match is detected as early as possible. This way I can fail a match against a custom string cstr (see the specification below) with the least number of operations (4th property).
From a theoretical perspective, the regex can be represented as a finite state machine and its arrows can be flipped, creating the reversed regex.
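To make the arrow-flipping idea concrete, here is a minimal sketch of my own, not taken from any library. It assumes a tiny NFA without epsilon moves, stored as nested dictionaries; the class name `NFA` and its methods are hypothetical. Reversal flips every transition and swaps the start and accept sets, and the reversed machine is then fed the input from the last character backwards.

```python
from collections import defaultdict


class NFA:
    """Tiny NFA without epsilon moves (hypothetical helper, for illustration only)."""

    def __init__(self, starts, accepts, transitions):
        self.starts = set(starts)
        self.accepts = set(accepts)
        # transitions: state -> {char -> set of next states}
        self.transitions = transitions

    def reverse(self):
        """Flip every arrow and swap the start and accept states."""
        flipped = defaultdict(lambda: defaultdict(set))
        for src, edges in self.transitions.items():
            for char, dsts in edges.items():
                for dst in dsts:
                    flipped[dst][char].add(src)
        return NFA(self.accepts, self.starts, flipped)

    def matches(self, chars):
        """Simulate the NFA and stop as soon as no state is alive."""
        current = set(self.starts)
        for char in chars:
            current = {
                dst
                for state in current
                for dst in self.transitions.get(state, {}).get(char, ())
            }
            if not current:
                return False  # dead: no need to look at further characters
        return bool(current & self.accepts)


# Hand-built NFA for the regex ab*c: 0 -a-> 1, 1 -b-> 1, 1 -c-> 2
forward = NFA(starts={0}, accepts={2},
              transitions={0: {"a": {1}}, 1: {"b": {1}, "c": {2}}})
backward = forward.reverse()

print(forward.matches("abbc"))             # consumes the string left to right
print(backward.matches(reversed("abbc")))  # consumes the string right to left
print(backward.matches(reversed("abbx")))  # dies on the very first character read
```

Building the NFA from a regex string is left out here; that is the part a library would have to provide.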
I'm looking for an implementation of this: a library or program to which I can give the string and the pattern. cstr is implemented in Python, so a Python module if possible. (For the curious: the i-th character is not calculated until needed.) For anything else I would need to do much more work, because cstr's calculation is hard to port to another language.
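To illustrate why matching from the end helps with lazily computed characters, here is a rough sketch. `LazyString` is a hypothetical stand-in for cstr (the real implementation is not shown in this question), and `step` is a hand-written reversed automaton for the example pattern `[a-z]*ing`, not something generated from the regex.

```python
class LazyString:
    """Hypothetical stand-in for cstr: character i is computed on first access only."""

    def __init__(self, length, compute_char):
        self._length = length
        self._compute_char = compute_char
        self._cache = {}
        self.computed = 0  # how many characters were actually computed

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        if i not in self._cache:
            self._cache[i] = self._compute_char(i)
            self.computed += 1
        return self._cache[i]


def match_from_end(step, start, accepting, s):
    """Run a deterministic reversed automaton over s, last character first."""
    state = start
    for i in range(len(s) - 1, -1, -1):
        state = step(state, s[i])
        if state is None:  # dead state: the remaining characters are never computed
            return False
    return state in accepting


def step(state, char):
    """Hand-written reversed automaton for [a-z]*ing: reads g, n, i, then letters."""
    expected = {0: "g", 1: "n", 2: "i"}
    if state in expected:
        return state + 1 if char == expected[state] else None
    return 3 if "a" <= char <= "z" else None


good = LazyString(7, lambda i: "testing"[i])
bad = LazyString(7, lambda i: "testers"[i])

print(match_from_end(step, 0, {3}, good), good.computed)  # every character is needed
print(match_from_end(step, 0, {3}, bad), bad.computed)    # fails after one character
```

The counter `computed` is only there to show how many characters a failed match touches.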
The implementation doesn't have to cover all regex syntax. I'm looking for the basics. No lookaheads or fancy stuff. See the specification below.
I may be lacking common knowledge. Please do comment on obvious things, too.
The custom string cstr has the following properties:
When the string is fully calculated, I want to match it with a simple regex which may contain these elements of the syntax: ., *, +, ?, \w, \W, [], |, the escape character \, and range specification with {, }. No lookaheads or fancy stuff.
PS: This is not a homework question. I'm trying to formulate my question as clearly as possible.
OP here. Here are some thoughts:
Since I'm looking for an unoptimized regex machine, I have to build it myself, which takes time.
Alternatively, we can define an upper bound for the length of cstr and create all strings that match the given regex with length < upper bound. Then we put all solutions into a trie data structure and match against it. This depends on the use case, and maybe a cache can be involved.
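The trie idea could look roughly like the sketch below. It assumes a small, known alphabet and enumerates candidates by brute force with the standard `re` module, which only makes sense for small upper bounds. The function names and the `END` marker are my own for illustration. The words are inserted reversed, so the lookup walks the string from its last character and fails at the first missing edge.

```python
import re
from itertools import product

END = ""  # marker key for "a word ends here"; cannot clash with single characters


def enumerate_matches(pattern, alphabet, upper_bound):
    """Brute-force all strings over alphabet with length < upper_bound that match."""
    compiled = re.compile(pattern)
    for length in range(upper_bound):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                yield candidate


def build_reversed_trie(words):
    """Insert every word back to front into a nested-dict trie."""
    root = {}
    for word in words:
        node = root
        for char in reversed(word):
            node = node.setdefault(char, {})
        node[END] = True
    return root


def trie_match_from_end(trie, s):
    """Walk the trie from the last character of s; fail at the first missing edge."""
    node = trie
    for i in range(len(s) - 1, -1, -1):
        node = node.get(s[i])
        if node is None:
            return False
    return END in node


words = sorted(enumerate_matches(r"ab*c", "abc", upper_bound=5))
trie = build_reversed_trie(words)

print(words)
print(trie_match_from_end(trie, "abbc"))
print(trie_match_from_end(trie, "abba"))  # rejected after reading only the last "a"
```

The trie only needs to be built once per pattern and bound, which is where a cache would fit.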
What I'm going for is the Python module greenery:
from greenery import parse
pattern = parse.Pattern(...)
pattern.reversed()
...
This sometimes provides a good matching experience. Sometimes not, but that is OK for me.