In many programming languages, the following
find foo([az]+)bar
and replace with GOO\\U\\1GAR
will result in the entire match being made uppercase. I can't seem to find the equivalent in python; does it exist?
You can pass a function to re.sub()
that will allow you to do this, here is an example:
def upper_repl(match):
return 'GOO' + match.group(1).upper() + 'GAR'
And an example of using it:
>>> re.sub(r'foo([a-z]+)bar', upper_repl, 'foobazbar')
'GOOBAZGAR'
Do you mean something like this?
>>>x = "foo spam bar"
>>>re.sub(r'foo ([a-z]+) bar', lambda match: r'foo {} bar'.format(match.group(1).upper()), x)
'foo SPAM bar'
For reference, here's the docstring of re.sub
(emphasis mine).
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable ; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.
If you already have a replacement string (template), you may not be keen on swapping it out with the verbosity of m.group(1)+...+m.group(2)+...+m.group(3)
... Sometimes it's nice to have a tidy little string.
You can use the MatchObject
's expand() function to evaluate a template for the match in the same manner as sub() , allowing you to retain as much of your original template as possible. You can use upper
on the relevant pieces.
re.sub(r'foo([a-z]+)bar', lambda m: 'GOO' + m.expand('\1GAR').upper())
While this would not be particularly useful in the example above, and while it does not aid with complex circumstances, it may be more convenient for longer expressions with a greater number of captured groups, such as a MAC address censoring regex, where you just want to ensure the full replacement is capitalized or not.
You could use some variation of this:
s = 'foohellobar'
def replfunc(m):
return m.groups()[0]+m.groups()[1].upper()+m.groups()[2]
re.sub('(foo)([a-z]+)(bar)',replfunc,s)
gives the output:
'fooHELLObar'
For those coming across this on google...
You can also use re.sub to match repeating patterns. For example, you can convert a string with spaces to camelCase:
def to_camelcase(string):
string = string[0].lower() + string[1:] # lowercase first
return re.sub(
r'[\s]+(?P<first>[a-z])', # match spaces followed by \w
lambda m: m.group('first').upper(), # get following \w and upper()
string)
to_camelcase('String to convert') # --> stringToConvert
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.