Python, regex: matching a string with repeating characters

Question

I am trying to search Apache log files for entries related to specific vulnerability scans. I need to match strings from a separate file against the URI content in the weblogs. Some of the strings I am trying to find contain repeating special characters like '?'.

For example, I need to be able to match an attack that contains just the string '????????', but I don't want to be alerted on the string '??????????????????', because each attack is tied to a specific attack ID number. Therefore, using:

if attack_string in log_file_line:
    alert_me()

...will not work. Because of this, I decided to put the string into a regex:

if re.findall(r'\%s' % re.escape(attack_string),log_file_line):
    alert_me()

...which did not work either, because any log file line containing the string '????????' is matched even if there are more than 8 '?' in the log file line.

I then tried adding boundaries to the regex:

if re.findall(r'\\B\%s\\B' % re.escape(attack_string),log_file_line):
    alert_me()

...which stopped matching in both cases. I need to be able to dynamically assign the string I am looking for, but I don't want to match on just any line that contains the string. How can I accomplish this?
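
The behaviour described in the question can be reproduced in a few lines. This is only a minimal sketch: the two log lines are invented for illustration, and re.escape is used without the extra leading backslash. It suggests that a substring test and an escaped regex both fire on the longer run, and that in a raw string '\\B' is the regex for a literal backslash followed by 'B', not the \B assertion, which would explain why that attempt never matched.

```python
import re

attack_string = "?" * 8  # the 8-character attack string from the question

# Hypothetical log lines, invented for illustration.
short_line = "GET /index.php" + "?" * 8    # exactly 8 '?'
long_line = "GET /index.php" + "?" * 18    # 18 '?'
lines = (short_line, long_line)

# A substring test cannot tell the two lines apart.
substring_hits = [attack_string in line for line in lines]

# Neither can an escaped regex: 8 '?' also occur inside a run of 18.
escaped = re.escape(attack_string)
regex_hits = [bool(re.search(escaped, line)) for line in lines]

# In a raw string, '\\B' means "a literal backslash, then the letter B",
# so this pattern only matches lines that contain the text '\B'.
literal_b = r'\\B' + escaped + r'\\B'
literal_b_hits = [bool(re.search(literal_b, line)) for line in lines]

print(substring_hits, regex_hits, literal_b_hits)
```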

Answer 1

How about:

(?:[^?]|^)\?{8}(?:[^?]|$)
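
As a rough Python sketch of how this pattern could be applied (the sample request lines are invented, and has_exact_run is a hypothetical helper name, not part of any library):

```python
import re

# The pattern from this answer: exactly eight '?', bounded on each side by
# a non-'?' character or by the start/end of the string.
EXACT_EIGHT = re.compile(r'(?:[^?]|^)\?{8}(?:[^?]|$)')

def has_exact_run(line):
    """Return True if the line contains a run of exactly eight '?'."""
    return EXACT_EIGHT.search(line) is not None

# Hypothetical log fragments, invented for illustration.
eight = "GET /a.php" + "?" * 8 + " HTTP/1.1"
eighteen = "GET /a.php" + "?" * 18 + " HTTP/1.1"

print(has_exact_run(eight), has_exact_run(eighteen))
```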

Explanation:

(?-imsx:(?:[^?]|^)\?{8}(?:[^?]|$))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    [^?]                     any character except: '?'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    ^                        the beginning of the string
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
  \?{8}                    '?' (8 times)
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    [^?]                     any character except: '?'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
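
Since the question asks for the search string to be assigned dynamically, the same idea can be generalised. The sketch below is not the pattern from this answer: it swaps the consuming groups for lookarounds and builds the regex from an arbitrary attack string with re.escape. The guard rule (the match must not be preceded by the string's first character or followed by its last) is an assumption that suits strings made of one repeated character; build_pattern and the sample lines are hypothetical.

```python
import re

def build_pattern(attack_string):
    """Compile a regex that matches attack_string only when it is not part
    of a longer run of its first/last character (an assumed guard rule).
    Assumes attack_string is non-empty."""
    first = re.escape(attack_string[0])
    last = re.escape(attack_string[-1])
    body = re.escape(attack_string)
    # (?<!X): not preceded by X;  (?!X): not followed by X
    return re.compile('(?<!' + first + ')' + body + '(?!' + last + ')')

pattern = build_pattern("?" * 8)

# Hypothetical log fragments, invented for illustration.
lines = [
    "GET /a.php" + "?" * 8 + " HTTP/1.1",    # exactly 8 '?'
    "GET /a.php" + "?" * 18 + " HTTP/1.1",   # 18 '?'
]
hits = [bool(pattern.search(line)) for line in lines]
print(hits)
```

Unlike the consuming version, the lookarounds do not swallow the neighbouring characters, which may matter if the matched text itself is needed afterwards.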

Answer 1
1 2012-10-04 08:12:14
