How can I optimize this regular expression?
I have a regular expression, and I would like to ask whether it can be simplified.
preg_match_all('/([0-9]{2}\.[0-9]{2}\.[0-9]{4}) (([01]?[0-9]|2[0-3])\:[0-5][0-9]\:[0-5][0-9]?) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/', $message->getMessageBody(), $info);
I think this is the best you can do:
preg_match_all('/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(?::[0-5]\d){1,2}) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/', $message, $info);
Unless you don't need those exact words in there. Then you could:
preg_match_all('/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(?::[0-5]\d){1,2})\D+(\d+) WM([A-Z])\D+(\d+)/', $message, $info);
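One way to check a rewrite like this is to run the original and the shortened pattern against the same input and compare the captures. The sketch below does that; the sample message is invented for illustration, and the shortened pattern uses a non-capturing group, (?:...), for the repeated time part so that the group numbers line up with the original.

```php
<?php
// Hypothetical sample message; the real message format may differ.
$sample = '12.03.2011 14:05:09 поступление на сумму 150 WMR от корреспондента 123456789012';

// Pattern from the question.
$original = '/([0-9]{2}\.[0-9]{2}\.[0-9]{4}) (([01]?[0-9]|2[0-3])\:[0-5][0-9]\:[0-5][0-9]?) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/';

// Shortened pattern; (?:...) groups without capturing, so groups 4 to 6 keep their numbers.
$shortened = '/((?:\d\d\.){2}\d{4}) (([01]?\d|2[0-3])(?::[0-5]\d){1,2}) поступление на сумму (\d+) WM([A-Z]) от корреспондента (\d+)/';

preg_match_all($original, $sample, $infoOriginal);
preg_match_all($shortened, $sample, $infoShortened);

// Both arrays hold the full match at index 0 and groups 1 to 6 after it.
var_dump($infoOriginal === $infoShortened);

// First (and only) match: full text, date, time, hour, amount, letter, correspondent.
print_r(array_column($infoShortened, 0));
```

A single sample only shows that the two patterns agree on that input; inputs with unusual times are worth adding to the comparison as well.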
You can start by using free-spacing mode and some comments (which will help your and everyone else's understanding, which in turn makes simplifying easier). Note that you'll have to put literal spaces in character classes ([ ]) now, though:
/
(                          # group 1
    [0-9]{2}\.[0-9]{2}\.[0-9]{4}
                           # match a date
)
[ ]
(                          # group 2
    (                      # group 3
        [01]?[0-9]         # match an hour from 0 to 19
    |                      # or
        2[0-3]             # match an hour from 20 to 23
    )
    \:
    [0-5][0-9]             # minutes
    \:
    [0-5][0-9]?            # seconds
)
[ ]поступление[ ]на[ ]сумму[ ]
                           # literal text
(\d+)                      # a number into group 4
[ ]WM                      # literal text
([A-Z])                    # a letter into group 5
[ ]от[ ]корреспондента[ ]
                           # literal text
(\d+)                      # a number into group 6
/x
Now we can't simplify the part at the end, unless you don't want to capture the parenthesised things, in which case you can simply omit most of the parentheses.
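The free-spacing behaviour used above can be seen in a small sketch: with the x modifier, plain whitespace in the pattern is ignored and # starts a comment that runs to the end of the line, so a literal space has to be written as [ ] (or escaped). The test strings here are made up for the demonstration.

```php
<?php
// Without x, the space in the pattern is a literal space.
$plain = preg_match('/WM R/', 'WM R');

// With x, the unbracketed space is dropped, so this pattern means "WMR".
$ignored = preg_match('/WM R/x', 'WM R');

// Inside a character class the space is kept, even with x.
$bracketed = preg_match('/WM[ ]R   # trailing comment, ignored
                         /x', 'WM R');

var_dump($plain, $ignored, $bracketed); // int(1), int(0), int(1)
```

Note that a comment in free-spacing mode must not contain the pattern delimiter (here /), because the delimiter still ends the pattern.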
You can slightly shorten the expression by using \d as a substitute for [0-9], in which case \d\d is even shorter than \d{2}.
Next, there is no need to escape colons.
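As a quick sanity check of these two shortenings, the sketch below compares a verbose and a short version of the time part on a few made-up inputs. The ^ and $ anchors and the test strings are additions for the comparison only, and both seconds digits are required here to keep the example small.

```php
<?php
// Hypothetical test inputs: valid times, an out-of-range hour, and non-digits.
$times = ['14:05:09', '9:00:00', '23:59:59', '24:00:00', 'ab:cd:ef'];

// Verbose form: [0-9] and escaped colons.
$verbose = '/^([01]?[0-9]|2[0-3])\:[0-5][0-9]\:[0-5][0-9]$/';
// Short form: \d and plain colons.
$short = '/^([01]?\d|2[0-3]):[0-5]\d:[0-5]\d$/';

$same = true;
foreach ($times as $t) {
    // preg_match returns 1 on a match and 0 otherwise.
    if (preg_match($verbose, $t) !== preg_match($short, $t)) {
        $same = false;
    }
}

var_dump($same);
```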
And finally, there seems to be something odd with your seconds. If you want to allow single-digit seconds, make the [0-5] optional, and not the \d after it:
/
(                          # group 1
    \d\d\.\d\d\.\d{4}      # match a date
)
[ ]
(                          # group 2
    (                      # group 3
        [01]?\d            # match an hour from 0 to 19
    |                      # or
        2[0-3]             # match an hour from 20 to 23
    )
    :
    [0-5]\d                # minutes
    :
    [0-5]?\d               # seconds
)
[ ]поступление[ ]на[ ]сумму[ ]
                           # literal text
(\d+)                      # a number into group 4
[ ]WM                      # literal text
([A-Z])                    # a letter into group 5
[ ]от[ ]корреспондента[ ]
                           # literal text
(\d+)                      # a number into group 6
/x
I don't think it will get much simpler than that.
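To see what the change to the seconds part does, the sketch below first compares the two seconds fragments on their own and then runs the full free-spacing pattern on a sample with single-digit seconds. The sample message and the ^...$ anchors are additions for the demonstration, not part of the original pattern.

```php
<?php
// Seconds fragments, anchored with ^ and $ only for this comparison.
$oldSeconds = '/^[0-5][0-9]?$/'; // second digit optional
$newSeconds = '/^[0-5]?\d$/';    // first digit optional

$results = [];
foreach (['7', '07', '59', '60'] as $s) {
    $results[$s] = [preg_match($oldSeconds, $s), preg_match($newSeconds, $s)];
}
foreach ($results as $s => [$old, $new]) {
    printf("%-3s old=%d new=%d\n", $s, $old, $new);
}

// Hypothetical sample message with single-digit seconds (":7").
$sample = '12.03.2011 14:05:7 поступление на сумму 150 WMR от корреспондента 123456789012';

// The final pattern, written as a multi-line string for the x modifier.
$pattern = '/
    ( \d\d\.\d\d\.\d{4} )        # group 1: date
    [ ]
    (                            # group 2: time
        ( [01]?\d | 2[0-3] )     # group 3: hour
        : [0-5]\d                # minutes
        : [0-5]?\d               # seconds
    )
    [ ]поступление[ ]на[ ]сумму[ ]
    (\d+)                        # group 4: a number
    [ ]WM
    ([A-Z])                      # group 5: a letter
    [ ]от[ ]корреспондента[ ]
    (\d+)                        # group 6: a number
/x';

$matched = preg_match($pattern, $sample, $m);
print_r($m);
```

With the original seconds fragment, a lone digit from 6 to 9 is rejected because the required first digit must be in the range 0 to 5; the revised fragment accepts any single digit and still rejects values such as 60.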