I am using a regex expression like "a.{1000000}b.{1000000}c" to pattern match on a string. However this is WAY too slow. Is there a better way to do this? I am not interested in the stuff between a, b and c, as long as their gap is of my specified size I care not of the content within. One can think of it as skipping n characters. Checking the index doesn't serve me well either, I need to be using some built-in method written in C. Any suggestions?
Thanks in advance
If you just need to verify that a string is in a given pattern and do not care to extract the a, b, nor c then this would work:
(?=^a.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}b.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}.{50000}c$)
The limit for regex quantifiers is 65535
so if you need one million then you would have to repeat .{50000}
20 times like I did above.
Now you just need to make Python code that says "if regex match then proceed"
Regex101 takes 68ms so I would consider that to be "fast".
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.