python: re.sub's replace function doesn't accept extra arguments - how to avoid global variable?

I'm trying to increment all timestamps (of the form 'HH:MM:SS') in a text file by a number of seconds specified by a command-line parameter to my program.

Here's a simplified version of my effort so far:

import re
from datetime import datetime, timedelta

time_diff = timedelta(seconds=10)

def replace_time(matchobj):
    if matchobj.group(1) not in [None, '']:
        return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")

print(re.sub(r'(\d\d:\d\d:\d\d)', replace_time, "01:27:55"))

This works fine: the result of running this is 01:28:05, which is just what I want.

However, I've heard that I should use global variables as little as possible. So I was wondering if there's a simple way to pass time_diff as an argument to replace_time instead of using a global variable.

I tried the obvious, but it failed:

def replace_time(matchobj, time_diff):
    if matchobj.group(1) not in [None, '']:
        return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")

time_diff = timedelta(seconds=10)
print(re.sub(r'(\d\d:\d\d:\d\d)', replace_time(matchobj, time_diff), "01:27:55"))

with this error: NameError: name 'matchobj' is not defined, so I can't pass matchobj directly.

I've looked at the standard re page and the standard re howto, but can't find the information I need there. How can I avoid using a global variable here? Can I somehow pass an extra argument to the replace_time function? Thanks in advance.

You can wrap a function in a closure like this:

def increment_by(time_diff):
    def replace_time(matchobj):
        if matchobj.group(1) not in [None, '']:
            return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")
    return replace_time

time_diff = timedelta(seconds=10)
print(re.sub(r'(\d\d:\d\d:\d\d)', increment_by(time_diff), "01:27:55"))
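
As a fuller sketch of the closure approach, written for Python 3 and assuming the shift arrives as a plain number of seconds (for example from a command-line argument), the factory can be applied to a whole block of text. The names make_shifter and shift_timestamps are hypothetical, not from the code above:

```python
import re
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S"
TIMESTAMP_RE = re.compile(r"\d\d:\d\d:\d\d")


def make_shifter(time_diff):
    """Return a replacement callback that remembers time_diff (hypothetical helper)."""
    def replace_time(matchobj):
        # time_diff is read from the enclosing scope, not from a global.
        stamp = datetime.strptime(matchobj.group(0), TIME_FORMAT)
        return (stamp + time_diff).strftime(TIME_FORMAT)
    return replace_time


def shift_timestamps(text, seconds):
    """Shift every HH:MM:SS timestamp in text by the given number of seconds."""
    return TIMESTAMP_RE.sub(make_shifter(timedelta(seconds=seconds)), text)


print(shift_timestamps("start 01:27:55, end 23:59:58", 10))
```

This should print "start 01:28:05, end 00:00:08": only the time portion is formatted, so values wrap past midnight. A string that matches the pattern but is not a valid time (such as 99:99:99) would make strptime raise ValueError.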

Or you can use a partial from the stdlib like this:

from functools import partial

def replace_time(time_diff, matchobj):
    if matchobj.group(1) not in [None, '']:
        return (datetime.strptime(matchobj.group(1), "%H:%M:%S") + time_diff).strftime("%H:%M:%S")

time_diff = timedelta(seconds=10)
print(re.sub(r'(\d\d:\d\d:\d\d)', partial(replace_time, time_diff), "01:27:55"))
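
If you would rather keep the (matchobj, time_diff) argument order from the question, partial can bind time_diff by keyword instead, and a lambda does the same job inline. A small Python 3 sketch; the lambda variant is an additional alternative, not something taken from the code above:

```python
import re
from datetime import datetime, timedelta
from functools import partial


def replace_time(matchobj, time_diff):
    # Same signature as the attempt in the question: match first, offset second.
    stamp = datetime.strptime(matchobj.group(1), "%H:%M:%S")
    return (stamp + time_diff).strftime("%H:%M:%S")


pattern = r"(\d\d:\d\d:\d\d)"
time_diff = timedelta(seconds=10)

# Bind time_diff by keyword; re.sub supplies the match object positionally.
with_partial = re.sub(pattern, partial(replace_time, time_diff=time_diff), "01:27:55")

# Equivalent inline wrapper using a lambda.
with_lambda = re.sub(pattern, lambda m: replace_time(m, time_diff), "01:27:55")

print(with_partial, with_lambda)
```

Either way, re.sub still receives a callable that takes exactly one argument, the match object.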

There is nothing wrong with your current approach. time_diff is written only once, and all later accesses are reads. In effect it is a module-wide constant.

You run into problems with shared global state when multiple threads access an object and at least one of them is writing. That's not happening here, so you have nothing to be concerned about.
