Decoding a regular expression in Perl

Can anyone decode what this regular expression means in Perl?

while (/([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*)/g)

Here is a breakdown of the regex:

(                     # start a capturing group (1)
   [0-9a-zA-Z-]+      # one or more digits or letters or hyphens
   (?:                # start a non-capturing group
      '               # a literal single quote character
      [a-zA-Z0-9-]+   # one or more digits or letters or hyphens
   )*                 # repeat non-capturing group zero or more times
)                     # end of capturing group 1

The regex has the form /.../g and is used as the condition of a while loop, so the loop body runs once for each non-overlapping match of the regex. Because no string is bound with =~, the match is made against $_.
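To make the loop behaviour concrete, here is a minimal sketch. The helper name extract_tokens and the sample sentence are made up for illustration; only the pattern itself comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): collect every
# match of the pattern found in the given string.
sub extract_tokens {
    my ($text) = @_;
    my @tokens;
    local $_ = $text;    # a bare /.../ match operates on $_
    while (/([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*)/g) {
        push @tokens, $1;    # $1 holds the text of capturing group 1
    }
    return @tokens;
}

my @tokens = extract_tokens("Don't stop, it's well-known: rock'n'roll!");
print join('|', @tokens), "\n";
# prints: Don't|stop|it's|well-known|rock'n'roll
```

Apostrophes and hyphens inside a word stay part of the token, while spaces and other punctuation separate tokens.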

There's a tool for that: YAPE::Regex::Explain

The regular expression:

(?-imsx:([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [0-9a-zA-Z\-]+           any character of: '0' to '9', 'a' to
                             'z', 'A' to 'Z', '\-' (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
----------------------------------------------------------------------
      '                        '\''
----------------------------------------------------------------------
      [a-zA-Z0-9\-]+           any character of: 'a' to 'z', 'A' to
                               'Z', '0' to '9', '\-' (1 or more times
                               (matching the most amount possible))
----------------------------------------------------------------------
    )*                       end of grouping
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

FJ's answer is a perfect breakdown. But... he left out an important piece, which is the /g at the end. It tells the regex engine to resume searching where the previous match ended. So the while loop steps through the string one match at a time until no further match is found.
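The "continue where it left off" behaviour can be observed with Perl's built-in pos function, which reports the offset at which the next /g match on a string will start. This is a small sketch; the sample string and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "it's a well-known trick";
my @positions;    # offsets recorded after each successful match

while ($text =~ /([0-9a-zA-Z\-]+(?:'[a-zA-Z0-9\-]+)*)/g) {
    push @positions, pos($text);
    print "matched '$1', next search starts at offset ", pos($text), "\n";
}

# Once the match fails, the loop ends and pos($text) is reset to undef,
# so a later /g match on $text would start again from the beginning.
print "recorded offsets: @positions\n";
```

Without /g, the match would start from the beginning of the string on every iteration and find the same first match each time, so the loop would never end.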
