Time complexity of regular expression with backreferencing

What is the time complexity of a regular expression like this, applied to an arbitrary string: "(.{5})\\1\\1"

My implementation is:

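reps <- function(s, n) paste(rep(s, n), collapse = "") # repeat s n times

# find the longest substring that occurs th times in a row
find.string <- function(string, th = 3, len = floor(nchar(string)/th)) {
    r <- -1L # stays -1 if nothing matches, or if len is 0
    for(k in rev(seq_len(len))) { # len:1 would iterate 0, 1 when len is 0
        # e.g. k = 5, th = 3 gives "(.{5})\\1\\1"
        pat <- paste0("(.{", k, "})", reps("\\1", th-1))
        r <- regexpr(pat, string, perl = TRUE)
        if (r > 0) break # the first k that matches is the longest
    }
    if (r > 0) substring(string, r, r + attr(r, "capture.length")-1) else ""
}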

Please do help out. Thanks! :)

It depends on the implementation. Because of the backreferences, this is not a regular expression in the strict sense of the term, but for this particular pattern the worst case looks like roughly 15 * length(string) steps, i.e. O(length(string)).

Explanation: The regex engine tries to match starting from position 0, 1, 2, 3, 4, ..., up to the last position in the string. Since the dot places no constraint on the characters, the group matches whatever 5 characters sit at the current position, and the engine then tries to match those same 5 characters two more times, doing at most 15 character steps before it succeeds or fails. On failure it moves to the next position in the string and tries again. So, in the worst case, it does this len(string) times.
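To make the counting argument concrete, here is a minimal sketch that mimics the naive left-to-right strategy described above and counts character steps. The function count.steps and its parameters k and th are hypothetical names introduced for illustration; this is not how PCRE is implemented internally, only a model of the reasoning.

```r
# Hypothetical step counter (an assumption, not PCRE's real algorithm):
# models "(.{k})" followed by (th - 1) backreferences to group 1.
count.steps <- function(string, k = 5, th = 3) {
    chars <- strsplit(string, "")[[1]]
    n <- length(chars)
    steps <- 0
    for (start in seq_len(n)) {
        # the group consumes k characters, or whatever is left
        avail <- min(k, n - start + 1)
        steps <- steps + avail
        if (avail < k) next
        # each backreference compares up to k characters,
        # stopping at the first mismatch or at the end of the string
        matched <- TRUE
        for (j in seq_len(th - 1)) {
            for (i in seq_len(k)) {
                pos <- start + j * k + i - 1
                steps <- steps + 1
                if (pos > n || chars[pos] != chars[start + i - 1]) {
                    matched <- FALSE
                    break
                }
            }
            if (!matched) break
        }
        if (matched) break # a regex search stops at the first match
    }
    steps
}

count.steps("abcdeabcdeabcde") # matches at the first position: 15 steps
```

Under this model each start position costs at most k * th steps, so the total never exceeds k * th * nchar(string), which is 15 * nchar(string) for the pattern in the question.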


 