
Python regular expression: replacing part of a matched string

I have a string that might look like this:

"myFunc('element','node','elementVersion','ext',12,0,0)"

I'm currently checking it for validity with the following pattern, which works fine:

myFunc\((.+?)\,(.+?)\,(.+?)\,(.+?)\,(.+?)\,(.+?)\,(.+?)\)

Now I'd like to replace whatever string is in the 3rd parameter. Unfortunately, I can't just do a plain string replace on the substring in the 3rd position, since the same substring could appear anywhere else in that string.

With this pattern and re.findall,

myFunc\(.+?\,.+?\,(.+?)\,.+?\,.+?\,.+?\,.+?\)

I was able to get the contents of the substring in the 3rd position, but re.sub does not replace just that substring; it only returns the string I want to replace it with :/

Here's my code:

import re

myRe = re.compile(r"myFunc\(.+?\,.+?\,(.+?)\,.+?\,.+?\,.+?\,.+?\)")
val = "myFunc('element','node','elementVersion','ext',12,0,0)"

print(myRe.findall(val))
print(myRe.sub("noVersion", val))

Any idea what I've missed?

Thanks! Seb

In re.sub, you need to specify a substitution for the whole matched string. That means you have to repeat the parts that you don't want to replace. This works:

myRe = re.compile(r"(myFunc\(.+?\,.+?\,)(.+?)(\,.+?\,.+?\,.+?\,.+?\))")
print(myRe.sub(r'\1"noversion"\3', val))
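A minimal sketch of a variation on this idea, assuming Python 3: the replacement can also be a function that receives the match object, which avoids backslash escapes in the replacement string. The helper name `replace_third` and the value `'noVersion'` are only illustrative.

```python
import re

# Groups 1 and 3 capture the parts to keep; group 2 is the third argument.
pattern = re.compile(r"(myFunc\(.+?,.+?,)(.+?)(,.+?,.+?,.+?,.+?\))")
val = "myFunc('element','node','elementVersion','ext',12,0,0)"


def replace_third(match):
    # Re-emit groups 1 and 3 unchanged and swap in the new third argument.
    return match.group(1) + "'noVersion'" + match.group(3)


result = pattern.sub(replace_third, val)
print(result)
```

A function replacement is handy when the new value has to be computed from the old one, for example from `match.group(2)`.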

If your only tool is a hammer, all problems look like nails. A regular expression is a powerful hammer, but it is not the best tool for every task.

Some tasks are better handled by a parser. In this case the argument list in the string looks just like a Python tuple, so you can cheat and use Python's built-in parser:

>>> import re
>>> strdata = "myFunc('element','node','elementVersion','ext',12,0,0)"
>>> args = re.search(r'\(([^\)]+)\)', strdata).group(1)
>>> eval(args)
('element', 'node', 'elementVersion', 'ext', 12, 0, 0)

If you can't trust the input, ast.literal_eval is safer than eval for this. Once you have deconstructed the argument list in the string, I think you can figure out how to manipulate and reassemble it, if needed.
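One way the deconstruct-and-reassemble step could look, as a rough sketch: it assumes the call has at least two arguments (so that the argument text parses as a tuple) and that every argument is a plain Python literal. The variable names are illustrative only.

```python
import ast
import re

strdata = "myFunc('element','node','elementVersion','ext',12,0,0)"

# Split the call into the function name and the raw argument text.
match = re.fullmatch(r"(\w+)\((.*)\)", strdata)
name, raw_args = match.group(1), match.group(2)

# literal_eval only accepts Python literals, so arbitrary code is rejected.
args = list(ast.literal_eval(raw_args))
args[2] = "noVersion"  # replace the third argument

# repr() restores the quotes around string arguments.
rebuilt = "{}({})".format(name, ",".join(repr(a) for a in args))
print(rebuilt)
```

Because the arguments are parsed rather than pattern-matched, a comma inside a quoted argument does not shift the positions.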

Read the documentation: re.sub returns a copy of the string in which every occurrence of the entire pattern is replaced with the replacement. It cannot modify the original string in any case, because Python strings are immutable.
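A short sketch, using the pattern and input from the question, of why the original call printed only the replacement text: the pattern matches the entire input, so the entire input is what gets replaced.

```python
import re

pattern = re.compile(r"myFunc\(.+?,.+?,(.+?),.+?,.+?,.+?,.+?\)")
val = "myFunc('element','node','elementVersion','ext',12,0,0)"

match = pattern.search(val)
whole = match.group(0)  # text matched by the whole pattern
third = match.group(1)  # text captured by the only group

replaced = pattern.sub("noVersion", val)
print(whole == val)  # the pattern covers the entire string
print(replaced)      # so the entire string is replaced
print(val)           # the original string is left unchanged
```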

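Try using look-ahead and look-behind assertions to construct a regex that matches only the element itself. Note that Python's built-in re module accepts only fixed-width look-behind, so the pattern below fails to compile with re; it needs an engine that supports variable-width look-behind, such as the third-party regex module: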

myRe = re.compile(r"(?<=myFunc\(.+?\,.+?\,)(.+?)(?=\,.+?\,.+?\,.+?\,.+?\))")
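With the standard library alone, a similar effect can be approximated by capturing the third argument and splicing the new text around the captured span. This is a sketch of an alternative technique, not the look-behind approach itself; it first checks that re rejects the variable-width look-behind.

```python
import re

val = "myFunc('element','node','elementVersion','ext',12,0,0)"

# The built-in re module rejects look-behind of variable width.
try:
    re.compile(r"(?<=myFunc\(.+?,.+?,)(.+?)(?=,.+?,.+?,.+?,.+?\))")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

# Alternative: capture the third argument and splice around its span.
pattern = re.compile(r"myFunc\(.+?,.+?,(.+?),.+?,.+?,.+?,.+?\)")
match = pattern.search(val)
start, end = match.span(1)
result = val[:start] + "'noVersion'" + val[end:]
print(lookbehind_ok, result)
```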

Have you tried using named groups? http://docs.python.org/howto/regex.html#search-and-replace

Hopefully that will let you target just the 3rd argument.
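A possible shape for the named-group approach, as a sketch: the group names `head`, `third`, and `tail` are made up for this example. The parts to keep are referenced by name in the replacement with `\g<name>`.

```python
import re

pattern = re.compile(
    r"(?P<head>myFunc\(.+?,.+?,)"   # everything up to the third argument
    r"(?P<third>.+?)"               # the third argument itself
    r"(?P<tail>,.+?,.+?,.+?,.+?\))"  # the remaining four arguments
)
val = "myFunc('element','node','elementVersion','ext',12,0,0)"

# Keep head and tail, replace only the text matched by the 'third' group.
result = pattern.sub(r"\g<head>'noVersion'\g<tail>", val)
print(result)
```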

If you want to do this without using a regex:

>>> s = "myFunc('element','node','elementVersion','ext',12,0,0)"
>>> l = s.split(",")
>>> l[2]="'noVersion'"
>>> s = ",".join(l)
>>> s
"myFunc('element','node','noVersion','ext',12,0,0)"
