简体   繁体   中英

python regular expression replacing part of a matched string

i got an string that might look like this

"myFunc('element','node','elementVersion','ext',12,0,0)"

i'm currently checking for validity using, which works fine

myFunc\((.+?)\,(.+?)\,(.+?)\,(.+?)\,(.+?)\,(.+?)\,(.+?)\)

now i'd like to replace whatever string is at the 3rd parameter. unfortunately i cant just use a stringreplace on whatever sub-string on the 3rd position since the same 'sub-string' could be anywhere else in that string.

with this and a re.findall,

myFunc\(.+?\,.+?\,(.+?)\,.+?\,.+?\,.+?\,.+?\)

i was able to get the contents of the substring on the 3rd position, but re.sub does not replace the string it just returns me the string i want to replace with :/

here's my code

myRe = re.compile(r"myFunc\(.+?\,.+?\,(.+?)\,.+?\,.+?\,.+?\,.+?\)")
val =   "myFunc('element','node','elementVersion','ext',12,0,0)"

print myRe.findall(val)
print myRe.sub("noVersion",val)

any idea what i've missed ?

thanks! Seb

In re.sub, you need to specify a substitution for the whole matching string. That means that you need to repeat the parts that you don't want to replace. This works:

myRe = re.compile(r"(myFunc\(.+?\,.+?\,)(.+?)(\,.+?\,.+?\,.+?\,.+?\))")
print myRe.sub(r'\1"noversion"\3', val)

If your only tool is a hammer, all problems look like nails. A regular expression is a powerfull hammer but is not the best tool for every task.

Some tasks are better handled by a parser. In this case the argument list in the string is just like a Python tuple, sou you can cheat: use the Python builtin parser:

>>> strdata = "myFunc('element','node','elementVersion','ext',12,0,0)"
>>> args = re.search(r'\(([^\)]+)\)', strdata).group(1)
>>> eval(args)
('element', 'node', 'elementVersion', 'ext', 12, 0, 0)

If you can't trust the input ast.literal_eval is safer than eval for this. Once you have the argument list in the string decontructed I think you can figure out how to manipulate and reassemble it again, if needed.

Read the documentation: re.sub returns a copy of the string where every occurrence of the entire pattern is replaced with the replacement. It cannot in any case modify the original string, because Python strings are immutable.

Try using look-ahead and look-behind assertions to construct a regex that only matches the element itself:

myRe = re.compile(r"(?<=myFunc\(.+?\,.+?\,)(.+?)(?=\,.+?\,.+?\,.+?\,.+?\))")

Have you tried using named groups? http://docs.python.org/howto/regex.html#search-and-replace

Hopefully that will let you just target the 3rd match.

If you want to do this without using regex:

>>> s = "myFunc('element','node','elementVersion','ext',12,0,0)"
>>> l = s.split(",")
>>> l[2]="'noVersion'"
>>> s = ",".join(l)
>>> s
"myFunc('element','node','noVersion','ext',12,0,0)"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM