Python re.sub behaves differently than re.findall

Question

I'm stumped. I'm coding Python 3.6.2, using PyCharm as my IDE. The following script fragment illustrates my problem:

import re

def dosubst(m):
    return m.group() + "X"

line = r"set @message = formatmessage('%s %s', @arg1, @arg2);"
m = re.findall(r"@\w+\b", line, re.IGNORECASE)
print(m[0])  # prints "@message"
print(m[1])  # prints "@arg1"
print(m[2])  # prints "@arg2"

foo = re.sub(r"@\w+\b", dosubst, line, re.IGNORECASE)
print(foo)  # prints "set @messageX = formatmessage('%s %s', @arg1X, @arg2);"

You can see that re.findall finds three matches. However, re.sub only calls the dosubst function twice. If I change @message to message, then re.sub still calls dosubst twice, but picks up @arg1 and @arg2. Baffled. I thought it might be greedy vs. possessive matching, etc., but changing @message to message and the resulting behavior rules that out. Can anyone explain? I'm trying to do some basic text parsing of SQL to refactor message formatting for a large number of files. I use regexr.com to prototype most of the regex stuff I do, and it also finds three occurrences of the pattern in the line. Thanks.

Answer 1

See the documentation. The fourth argument to re.sub is count, not flags. Since re.IGNORECASE happens to have the integer value 2, you are telling it to do at most two substitutions. Instead, pass flags by keyword:

>>> re.sub(r"@\w+\b", dosubst, line, flags=re.IGNORECASE)
"set @messageX = formatmessage('%s %s', @arg1X, @arg2X);"
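To make the mechanism concrete, here is a small sketch that reuses the line and dosubst from the question. The variable names (flag_value, limited, explicit, full) are only illustrative. It checks that the positional call gives the same result as an explicit count=2:

```python
import re

def dosubst(m):
    # Append "X" to whatever the pattern matched
    return m.group() + "X"

line = r"set @message = formatmessage('%s %s', @arg1, @arg2);"
pattern = r"@\w+\b"

# re.IGNORECASE is an integer flag whose value is 2
flag_value = int(re.IGNORECASE)

# Passed positionally, the flag lands in the count slot, so this
# call replaces at most two matches and sets no flags at all
limited = re.sub(pattern, dosubst, line, re.IGNORECASE)

# The same call with the limit spelled out
explicit = re.sub(pattern, dosubst, line, count=2)

# Passed by keyword, the flag is applied and every match is replaced
full = re.sub(pattern, dosubst, line, flags=re.IGNORECASE)

print(flag_value)
print(limited == explicit)
print(full)
```

This also accounts for the second observation in the question: with @message changed to message, only @arg1 and @arg2 match, and both fit within the limit of two.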

Answer 2

Pass count=0 as the fourth argument, so that re.IGNORECASE ends up in the fifth position, which is flags. With count=0 every match is replaced; if you put a positive number there instead of 0, at most that many matches are replaced.

foo = re.sub(r"@\w+\b", dosubst, line, 0, re.IGNORECASE)
print(foo)

output:

set @messageX = formatmessage('%s %s', @arg1X, @arg2X);
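Another way to avoid mixing up count and flags, offered here as an alternative sketch rather than something either answer proposed, is to compile the pattern first. The flags then go to re.compile, and the compiled pattern's sub method takes only repl, string and count. The names pattern, result and first_only are illustrative:

```python
import re

def dosubst(m):
    # Append "X" to whatever the pattern matched
    return m.group() + "X"

line = r"set @message = formatmessage('%s %s', @arg1, @arg2);"

# Flags are attached once, at compile time
pattern = re.compile(r"@\w+\b", re.IGNORECASE)

# Pattern.sub has no flags parameter, so nothing can be misplaced
result = pattern.sub(dosubst, line)

# count is still available when a limit is really wanted
first_only = pattern.sub(dosubst, line, count=1)

print(result)
print(first_only)
```

Compiling once is also convenient when the same pattern is applied to many files, as in the refactoring task described in the question.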
