
How can I optimize this regular expression?

I have a regular expression and would like to know whether it can be simplified:

preg_match_all('/([0-9]{2}\.[0-9]{2}\.[0-9]{4}) (([01]?[0-9]|2[0-3])\:[0-5][0-9]\:[0-5][0-9]?) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/', $message->getMessageBody(), $info);

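I think this is about the best you can do. Note that it makes the seconds part optional and adds a capture group for the repeated :MM part, so the last three groups move from 4-6 to 5-7: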

preg_match_all('/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(:[0-5]\d){1,2}) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/', $message, $info);

That is, unless you don't need to match those exact words. In that case you can replace them with \D+:

preg_match_all('/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(:[0-5]\d){1,2})\D+(\d+) WM([A-Z])\D+(\d+)/', $message, $info);
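As a quick check, here is a minimal sketch that runs both patterns against one sample line. The message text is made up for illustration (the real input comes from $message->getMessageBody() in the question), so treat the values as assumptions.

```php
<?php
// Hypothetical message body, invented for illustration only.
$body = '05.03.2012 14:07:59 поступление на сумму 150 WMZ от корреспондента 123456789012';

$strict = '/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(:[0-5]\d){1,2}) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/';
$loose  = '/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(:[0-5]\d){1,2})\D+(\d+) WM([A-Z])\D+(\d+)/';

preg_match_all($strict, $body, $strictInfo);
preg_match_all($loose, $body, $looseInfo);

// With the default PREG_PATTERN_ORDER, $info[n][0] is group n of the first match.
// Group 4 holds the last repeated ":MM" part, so the remaining values are groups 5 to 7.
$labels = [1 => 'date', 2 => 'time', 5 => 'amount', 6 => 'letter', 7 => 'correspondent'];
foreach ($labels as $n => $label) {
    printf("%-14s strict=%s loose=%s\n", $label, $strictInfo[$n][0], $looseInfo[$n][0]);
}
```

The looser pattern is shorter but will also accept any other non-digit text between the fields, so it is only a fair trade if the surrounding words never matter.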

You can start by using free-spacing mode (the x modifier) and adding some comments. That helps you and everyone else understand the pattern, which in turn makes simplifying easier. Note that whitespace in the pattern is then ignored, so literal spaces now have to go in a character class ([ ]):

/
(             # group 1
  [0-9]{2}\.[0-9]{2}\.[0-9]{4}
              # match a date
)
[ ]
(             # group 2
  (           # group 3
    [01]?[0-9]# match an hour from 0 to 19
  |           # or
    2[0-3]    # match an hour from 20 to 23
  )
  \:       
  [0-5][0-9]  # minutes
  \:
  [0-5][0-9]? # seconds
)
[ ]поступление[ ]на[ ]сумму[ ]
              # literal text
(\d+)         # a number into group 4
[ ]WM         # literal text
([A-Z])       # a letter into group 5
[ ]от[ ]корреспондента[ ]
              # literal text
(\d+)         # a number into group 6
/x
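To see why the [ ] is needed, here is a small sketch of how the x modifier treats whitespace in PHP. The subject string is a made-up, shortened input containing only the date and time.

```php
<?php
// Hypothetical input, shortened to the date and time part.
$subject = '05.03.2012 14:07:59';

// Without /x, the space in the pattern is a literal space.
$plain = preg_match('/\d{4} \d\d?:/', $subject);

// With /x, the same unescaped space is ignored, so this means \d{4}\d\d?: and no longer matches.
$ignored = preg_match('/\d{4} \d\d?:/x', $subject);

// Putting the space in a character class keeps it significant under /x.
$kept = preg_match('/\d{4}[ ]\d\d?:/x', $subject);

// A commented, multi-line pattern. The comments must not contain the delimiter character.
$pattern = '/
    (\d\d\.\d\d\.\d{4})   # group 1: date
    [ ]                   # literal space
    (\d\d?:\d\d:\d\d)     # group 2: time
/x';
preg_match($pattern, $subject, $m);

var_dump($plain, $ignored, $kept, $m[1], $m[2]);
```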

We can't simplify the part at the end, unless you don't need to capture the parenthesised parts, in which case you can simply omit most of the parentheses.
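For illustration, a sketch of the date and time part with no captures at all. The only parentheses left are the ones around the hour alternation, which are still needed for grouping and are written as a non-capturing group here. The subject string is a hypothetical example.

```php
<?php
// Hypothetical input, shortened to the date and time part.
$subject = '05.03.2012 14:07:59';

// (?:...) groups the hour alternatives without creating a numbered capture.
$pattern = '/\d\d\.\d\d\.\d{4} (?:[01]?\d|2[0-3]):[0-5]\d:[0-5]\d/';

preg_match($pattern, $subject, $m);

// $m contains only the full match at index 0.
var_dump($m);
```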

You can shorten the expression slightly by using \d in place of [0-9], in which case \d\d is even shorter than \d{2}.

Next, there is no need to escape colons.

Finally, something looks odd about your seconds. [0-5][0-9]? makes the second digit optional, so a lone 7 would be rejected. If you want to allow single-digit seconds, make the [0-5] optional, not the \d after it:

/
(             # group 1
  \d\d\.\d\d\.\d{4}
              # match a date
)
[ ]
(             # group 2
  (           # group 3
    [01]?\d   # match an hour from 0 to 19
  |           # or
    2[0-3]    # match an hour from 20 to 23
  )
  :       
  [0-5]\d     # minutes
  :
  [0-5]?\d    # seconds
)
[ ]поступление[ ]на[ ]сумму[ ]
              # literal text
(\d+)         # a number into group 4
[ ]WM         # literal text
([A-Z])       # a letter into group 5
[ ]от[ ]корреспондента[ ]
              # literal text
(\d+)         # a number into group 6
/x
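The difference between the two seconds sub-patterns can be checked in isolation. This is a minimal sketch: the anchors ^ and $ are added here only so that each test string has to match as a whole, and the test values are arbitrary.

```php
<?php
// Seconds sub-pattern from the question: the second digit is optional.
$original = '/^[0-5][0-9]?$/';
// Suggested form: the leading tens digit is optional instead.
$fixed = '/^[0-5]?\d$/';

// Arbitrary sample values for the seconds field.
foreach (['07', '59', '5', '7', '60'] as $seconds) {
    printf(
        "%-3s original=%d fixed=%d\n",
        $seconds,
        preg_match($original, $seconds),
        preg_match($fixed, $seconds)
    );
}
```

The two forms only disagree on single digits from 6 to 9, which the original rejects and the suggested form accepts.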

I don't think it will get much simpler than that.
