I'd just like to ask how to speed up re.search in Python.
I have a long string that is 176,861 characters long (i.e. alphanumeric characters with some symbols), and I tested an re.search against this line using this function:
import re
import time

def getExecTime():
    start_time = time.time()
    re.search(r'.*^string .*=.*', temp)  # temp is the long string
    stop_time = time.time() - start_time
    print "Execution time is : %s seconds" % stop_time
The average result of this is ~414 seconds (around 6 to 7 minutes). Is there any way I can reduce this to, let's say, around ~2 minutes or less? Based on other people's feedback here, splitting this long line into a list of strings does not produce any significant improvement in execution time. Any ideas are greatly appreciated. Thanks in advance!
re.search already goes character by character, so starting your pattern with .* just means it will always match and every character of the large string becomes a candidate, forcing the engine to backtrack over the whole string at every starting position. You need to improve your regular expression, or use re.match instead of re.search.
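To illustrate the difference, here is a small sketch using a synthetic stand-in for the long line (hypothetical data, not the asker's actual string). The leading .* combined with the misplaced ^ means the original pattern can never match past position 0, yet the engine still backtracks at every offset:

```python
import re
import time

# Synthetic stand-in for the asker's long line (hypothetical data).
temp = 'a' * 2000 + 'string name=value' + 'b' * 2000

# Original pattern: the leading .* backtracks over the remainder of the
# string at every starting position, and without re.MULTILINE the ^ can
# only match at position 0, so the entire search is wasted work.
t0 = time.time()
slow = re.search(r'.*^string .*=.*', temp)
t_slow = time.time() - t0

# Without the leading .* and the misplaced ^, the engine scans once.
t0 = time.time()
fast = re.search(r'string [^=]*=.*', temp)
t_fast = time.time() - t0

print(slow)               # None: ^ never matches mid-string
print(fast.group()[:17])  # the 'string name=value' portion
```

Even on this short string the gap is measurable; at 176,861 characters the quadratic backtracking of the first pattern is what dominates the ~414 seconds.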
Also, you are using ^ in the wrong place, I believe. It can either signify the start of the string (or the start of a line, in which case you need to pass the re.MULTILINE flag to the compiler/regex), or it means "not" when used at the beginning of a character set.
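A quick sketch of both behaviours, on made-up sample text:

```python
import re

text = "int x=1\nstring name=alice\nfloat y=2"

# Without re.MULTILINE, ^ only matches at the very start of the string:
print(re.search(r'^string .*=.*', text))  # None

# With re.MULTILINE, ^ also matches right after every newline:
m = re.search(r'^string .*=.*', text, re.MULTILINE)
print(m.group())  # 'string name=alice'

# Inside a character set, ^ negates: [^=] matches any char except '=':
print(re.search(r'[^=]+', 'a=b').group())  # 'a'
```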
You should change your regex to something like this:

r'string [^=]*=.*'

This says: look for the word "string" followed by a space, then any number of characters that are not =, then =, then anything. Also, you might want to use + instead of *, because * also matches zero characters, whereas + requires at least one.
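Trying the suggested pattern on a made-up line, and showing the * vs + distinction:

```python
import re

# Hypothetical input; the asker's real data is much longer.
line = "some prefix string count =42 more text"
m = re.search(r'string [^=]*=.*', line)
print(m.group())  # 'string count =42 more text'

# With * an empty name before '=' still matches; + requires at least
# one non-'=' character between 'string ' and the '=':
print(re.search(r'string [^=]*=', 'string =5'))  # matches
print(re.search(r'string [^=]+=', 'string =5'))  # None
```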
But without more information on your end, it is hard to tell what exactly is needed.