I have this RegEx pattern
^(\\d|\\w)+\\..*
and this is my input
(1) nu11111111111111
(2) nu1111111111111111111
(3) nu1111111111111111111111111111111111111
Input 2 takes longer than input 1, and both return "Not Matched". For input 3, however, I got no response even after 30 minutes of execution. I am also monitoring memory usage, and it increases continuously.
Below is my code snippet:
String input1 = "nu11111111111111";
String input2 = "nu1111111111111111111";
String input3 = "nu1111111111111111111111111111111111111";
try
{
    if (input3.matches("^(\\d|\\w)+\\..*"))
    {
        System.out.println("Matched");
    }
    else
    {
        System.out.println("Not Matched");
    }
}
catch (Exception e)
{
    e.printStackTrace();
}
This is another case of catastrophic backtracking, as \d is already included in \w. Every 1 in your input can be matched by either alternative, and since no match can be found (the input contains no dot), the regex engine backtracks through every possible combination of \d and \w for your series of 1s. That is roughly 2^n combinations for n digits, which is quite a lot.
To get some insight into what is happening, see https://regex101.com/r/4fRRpc/1/ and open the regex debugger. It uses a PCRE pattern without startup optimizations, which should be fairly similar to what Java appears to do in this case.
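The growth can also be observed directly in Java. The following is only a rough sketch: the class name BacktrackingGrowth and the helpers input and timeMatch are made up for illustration, and the input lengths are deliberately kept small so that the program finishes quickly. The absolute timings depend on the machine and JVM; the point is only that each additional digit should roughly double the time of the failed match.

```java
import java.util.regex.Pattern;

public class BacktrackingGrowth {
    // The pattern from the question, written as a Java string literal.
    static final Pattern ORIGINAL = Pattern.compile("^(\\d|\\w)+\\..*");

    // Builds "nu" followed by `digits` copies of '1', the same shape as the inputs in the question.
    static String input(int digits) {
        StringBuilder sb = new StringBuilder("nu");
        for (int i = 0; i < digits; i++) {
            sb.append('1');
        }
        return sb.toString();
    }

    // Returns how long one failed match attempt takes, in nanoseconds.
    static long timeMatch(int digits) {
        String s = input(digits);
        long start = System.nanoTime();
        boolean matched = ORIGINAL.matcher(s).matches();
        long elapsed = System.nanoTime() - start;
        if (matched) {
            // The input has no dot, so the pattern cannot match.
            throw new IllegalStateException("unexpected match");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Small lengths on purpose: much longer inputs may not finish in reasonable time.
        for (int digits = 10; digits <= 20; digits += 2) {
            System.out.printf("%2d digits: %d us%n", digits, timeMatch(digits) / 1_000);
        }
    }
}
```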
For your regex, use ^\\w+\\..* (written as a Java string literal) instead.
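As a quick check, here is a minimal sketch of the simplified pattern run against the longest input from the question. The class name FixedPattern and the second input (which contains a dot) are hypothetical and only there for illustration. Without the alternation there is only one way to consume each character, so the failed match should return almost immediately.

```java
import java.util.regex.Pattern;

public class FixedPattern {
    // \w already covers digits, so the (\d|\w) alternation is replaced by \w alone.
    static final Pattern FIXED = Pattern.compile("^\\w+\\..*");

    static boolean matches(String s) {
        return FIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Input 3 from the question: no dot, so no match.
        System.out.println(matches("nu1111111111111111111111111111111111111") ? "Matched" : "Not Matched");
        // Hypothetical input containing a dot, to show a successful match.
        System.out.println(matches("nu11111111111111.txt") ? "Matched" : "Not Matched");
    }
}
```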
That Java regex engine is pathetic.
› time perl -E'say /^(\d|\w)+\..*/ ? "Matched" : "Not Matched" for qw(nu11111111111111 nu1111111111111111111 nu1111111111111111111111111111111111111)'
Not Matched
Not Matched
Not Matched
real 0m0,009s
user 0m0,006s
sys 0m0,003s
Try RE2; it does not backtrack and has Java bindings.