How to replace/modify a pattern with a regular expression in Python?

Assume that I want to modify all occurrences of a pattern in a script. Take one line as an example:

line = "assert Solution().oddEvenList(genNode([2,1,3,5,6,4,7])) == genNode([2,3,6,7,1,5,4]), 'Example 2'"

Notice that the function genNode takes a List[int] as its parameter. What I want is to remove the list brackets and keep all the integers in the list, so that the function actually takes *nums as its parameters.

Expecting:

line = "assert Solution().oddEvenList(genNode(2,1,3,5,6,4,7)) == genNode(2,3,6,7,1,5,4), 'Example 2'"

I've come up with a re pattern

r"([g][e][n][N][o][d][e][(])([[][0-9\,\s]*[]])([)])"

but I am not sure how I could use this... I can't get re.sub to work as it requires me to replace with a fixed string.

How can I achieve my desired result?

You can do:

re.sub(r'(genNode\()\[([^]]+)\]', r'\1\2', line)
  • (genNode\() matches genNode( and puts it in capture group 1
  • \[ matches a literal [
  • ([^]]+) matches up to the next ] and puts it in capture group 2
  • \] matches a literal ]

In the replacement, we use only the captured groups, i.e. the [ and ] are dropped.
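
On the concern that re.sub needs a fixed replacement string: besides backreferences such as \1, the replacement argument may also be a function that receives each match object and returns the text to substitute. A minimal sketch of the same substitution written that way (the helper name strip_brackets is made up for illustration):

```python
import re

line = "assert Solution().oddEvenList(genNode([2,1,3,5,6,4,7])) == genNode([2,3,6,7,1,5,4]), 'Example 2'"

def strip_brackets(match):
    # group(1) is "genNode(", group(2) is the list contents without [ and ]
    return match.group(1) + match.group(2)

# re.sub calls strip_brackets once per match and inserts its return value
result = re.sub(r'(genNode\()\[([^]]+)\]', strip_brackets, line)
print(result)
```

For a plain "drop the brackets" rewrite the backreference form is shorter; a function is mainly useful when the replacement has to be computed from the match.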


You can get rid of the first captured group by using a zero-width positive lookbehind to match the portion before [ :

re.sub(r'(?<=genNode\()\[([^]]+)\]', r'\1', line)

Example:

In [444]: line = "assert Solution().oddEvenList(genNode([2,1,3,5,6,4,7])) == genNode([2,3,6,7,1,5,4]), 'Example 2'"

In [445]: re.sub(r'(genNode\()\[([^]]+)\]', r'\1\2', line)
Out[445]: "assert Solution().oddEvenList(genNode(2,1,3,5,6,4,7)) == genNode(2,3,6,7,1,5,4), 'Example 2'"

In [446]: re.sub(r'(?<=genNode\()\[([^]]+)\]', r'\1', line)
Out[446]: "assert Solution().oddEvenList(genNode(2,1,3,5,6,4,7)) == genNode(2,3,6,7,1,5,4), 'Example 2'"

FWIW, using the typical non-greedy pattern .*? instead of [^]]+ would work as well:

re.sub(r'(?<=genNode\()\[(.*?)\]', r'\1', line)
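
The two variants give the same result on the line from the question, but they are not identical in general. One difference, sketched below on a made-up sample string: .*? can match nothing, so an empty list is rewritten too, while [^]]+ needs at least one character between the brackets and leaves an empty list alone.

```python
import re

# Made-up sample containing an empty list and a non-empty list
sample = "genNode([]) == genNode([1,2])"

# Non-greedy version: (.*?) may match the empty string
lazy = re.sub(r'(?<=genNode\()\[(.*?)\]', r'\1', sample)

# Negated-class version: ([^]]+) requires at least one character
negated = re.sub(r'(?<=genNode\()\[([^]]+)\]', r'\1', sample)

print(lazy)     # genNode() == genNode(1,2)
print(negated)  # genNode([]) == genNode(1,2)
```

Which behavior is preferable depends on whether genNode() with no arguments is meaningful in the script being rewritten.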

Instead of writing [g][e][n][N][o][d][e][(] you could write genNode\(

The character class you currently use, [0-9\,\s]*, matches any of the listed characters zero or more times. It could, for example, match only commas, so it does not ensure that the list consists of comma-separated digits.

To match comma-delimited integers, you could match 1+ digits followed by a repeating group that matches a comma and 1+ digits.
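
A quick way to see the difference is to test both sub-patterns on their own with re.fullmatch. This is only a sketch; the names loose and strict and the sample strings are chosen for illustration.

```python
import re

loose = re.compile(r"[0-9\,\s]*")      # character class from the question
strict = re.compile(r"\d+(?:,\d+)*")   # 1+ digits, then zero or more ",digits"

for text in ["2,1,3", "7", ",,,", ""]:
    # Show whether each pattern accepts the whole string
    print(repr(text), bool(loose.fullmatch(text)), bool(strict.fullmatch(text)))
```

The loose class accepts all four strings, including ",,," and the empty string, while the strict pattern accepts only "2,1,3" and "7". Note that the strict pattern as written does not allow whitespace after the commas; if the script contains lists like [2, 1, 3], the pattern would need something like \s* added around the comma.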

At the end, use a positive lookahead to assert the closing parenthesis, or capture it in group 3 and also use that group in the replacement.

With the pattern below, use r'\1\2' as the replacement.

(genNode\()\[(\d+(?:,\d+)*)\](?=\))

Explanation

  • (genNode\() Capture in group 1 matching genNode(
  • \[ Match [
  • ( Capturing group 2
    • \d+(?:,\d+)* Match 1+ digits and repeat 0+ times a comma and 1+ digits (to also support a single integer)
  • ) Close group 2
  • \] Match ]
  • (?=\)) Positive lookahead, assert what is on the right is a closing parenthesis )

For example

import re

regex = r"(genNode\()\[(\d+(?:,\d+)*)\](?=\))"
line = "assert Solution().oddEvenList(genNode([2,1,3,5,6,4,7])) == genNode([2,3,6,7,1,5,4]), 'Example 2'"
result = re.sub(regex, r"\1\2", line)

if result:
    print(result)

Result

assert Solution().oddEvenList(genNode(2,1,3,5,6,4,7)) == genNode(2,3,6,7,1,5,4), 'Example 2'
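
Since the goal in the question is to update every such call in a script, the same substitution can be run over the whole source text in one go, because re.sub replaces all non-overlapping matches. A sketch under that assumption, where the script contents are invented sample data rather than the asker's actual file:

```python
import re

pattern = re.compile(r"(genNode\()\[(\d+(?:,\d+)*)\](?=\))")

# Hypothetical script contents, used only for illustration
script = (
    "assert Solution().oddEvenList(genNode([1,2,3,4,5])) == genNode([1,3,5,2,4]), 'Example 1'\n"
    "assert Solution().oddEvenList(genNode([2,1,3,5,6,4,7])) == genNode([2,3,6,7,1,5,4]), 'Example 2'\n"
)

# subn returns the rewritten text and the number of replacements made
updated, count = pattern.subn(r"\1\2", script)
print(count)
print(updated)
```

Using subn instead of sub makes it easy to check that the number of rewritten calls is what you expect before writing the text back to a file.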

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.