Python: re.sub's replacement function doesn't accept extra arguments - how to avoid a global variable?

I'm trying to increment all timestamps (of the form 'HH:MM:SS') in a text file by a number of seconds specified by a command-line parameter to my program.

Here's a simplified version of my effort so far:

import re
from datetime import datetime, timedelta

time_diff = timedelta(seconds=10)

def replace_time(matchobj):
    if matchobj.group(1) not in [None, '']:
        return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")

print(re.sub(r'(\d\d:\d\d:\d\d)', replace_time, "01:27:55"))

This works fine: the result of running this is 01:28:05 which is just what I want.

However, I've heard that I should use global variables as little as possible. So I was wondering if there's a simple way to pass time_diff as an argument to replace_time instead of using a global variable.

I tried the obvious, but it failed:

def replace_time(matchobj, time_diff):
    if matchobj.group(1) not in [None, '']:
        return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")

time_diff = timedelta(seconds=10)
print(re.sub(r'(\d\d:\d\d:\d\d)', replace_time(matchobj, time_diff), "01:27:55"))

with this error: NameError: name 'matchobj' is not defined, so I can't pass matchobj directly.

I've looked at the standard re page and the standard re howto, but can't find the information I need there. How can I avoid using a global variable here? Can I somehow pass an extra argument to the replace_time function? Thanks in advance.

You can wrap a function in a closure like this:

def increment_by(time_diff):
    def replace_time(matchobj):
        if matchobj.group(1) not in [None, '']:
            return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")
    return replace_time

time_diff = timedelta(seconds=10)
print(re.sub(r'(\d\d:\d\d:\d\d)', increment_by(time_diff), "01:27:55"))
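For a one-off call, the same closure can be written inline with a lambda. This is only a sketch, assuming Python 3 syntax and the two-argument replace_time from the question: re.sub calls the lambda with the match object, and the lambda forwards it together with time_diff.

```python
import re
from datetime import datetime, timedelta

def replace_time(matchobj, time_diff):
    # Shift the matched HH:MM:SS timestamp by time_diff.
    shifted = datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff
    return shifted.strftime("%H:%M:%S")

time_diff = timedelta(seconds=10)

# re.sub passes exactly one argument (the match object) to the callable,
# so the lambda supplies the second argument itself.
result = re.sub(r"(\d\d:\d\d:\d\d)", lambda m: replace_time(m, time_diff), "01:27:55")
print(result)  # 01:28:05
```

The original attempt failed because replace_time(matchobj, time_diff) was evaluated before re.sub ran; the lambda defers that call until a match exists.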

Or you can use a partial from stdlib like this:

from functools import partial

def replace_time(time_diff, matchobj):
    if matchobj.group(1) not in [None, '']:
        return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")

time_diff = timedelta(seconds=10)
print(re.sub(r'(\d\d:\d\d:\d\d)', partial(replace_time, time_diff), "01:27:55"))
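If you would rather keep matchobj as the first parameter, partial can also bind time_diff by keyword. The sketch below assumes Python 3; shift_timestamps and TIMESTAMP are hypothetical names introduced for illustration, and the seconds value stands in for the command-line parameter mentioned in the question.

```python
import re
from datetime import datetime, timedelta
from functools import partial

# Hypothetical module-level pattern for HH:MM:SS timestamps.
TIMESTAMP = re.compile(r"\d\d:\d\d:\d\d")

def replace_time(matchobj, time_diff):
    # group(0) is the whole match, so no capturing group is needed.
    shifted = datetime.strptime(matchobj.group(0), "%H:%M:%S") + time_diff
    return shifted.strftime("%H:%M:%S")

def shift_timestamps(text, seconds):
    # Hypothetical helper: bind time_diff by keyword, leaving matchobj
    # as the single positional argument that re.sub will supply.
    replacer = partial(replace_time, time_diff=timedelta(seconds=seconds))
    return TIMESTAMP.sub(replacer, text)

print(shift_timestamps("start 01:27:55, end 23:59:58", 10))
# start 01:28:05, end 00:00:08
```

Note that times wrap past midnight because only the time portion is formatted, and that a string such as 99:99:99 matches the pattern but makes strptime raise ValueError.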

There is nothing wrong with your current approach. time_diff is written to only once, and all later accesses are reads. In effect, it is a module-wide constant.

You run into problems with shared global state when you have multiple threads accessing an object and at least one of the threads is writing. That's not happening here and you have nothing to be concerned about.
