
python re.sub(): difficulty with special characters (maybe?)

I would like to use re.sub to turn this:

string = '\\(2 \\, e^{\\left(2 \\, x\\right)} \\sin\\left(2 \\, x\\right) + 2 \\, e^{\\left(2 \\, x\\right)} \\cos\\left(2 \\, x\\right)\\)'

into this:

'\\(2 \\, e^{2 \\, x} \\sin\\left(2 \\, x\\right) + 2 \\, e^{2 \\, x} \\cos\\left(2 \\, x\\right)\\)'

This is my best attempt, but it does not work:

re.sub(r'(?P<left-edge>e\^{\\left\()(?P<input>.*)(?P<right-edge>\\right\)})','e^{\g<input>}',string)

Note that <input> needs to handle an arbitrary expression, while <left-edge> and <right-edge> are fixed character strings. 

I assume it has to do with the special characters involved, but several dozen failed attempts show that it is beyond my expertise.

Backslashes in regular expressions must be escaped. You used r'', so they do not have to be escaped again as characters in a Python string literal, but that alone is not enough for them to be interpreted as literal \ characters by the regex engine. Use double backslashes:

re.sub(r'(?P<left-edge>e\^{\\left\()(?P<input>.*)(?P<right-edge>\\right\)})','e\^{\\g<input>}',string)

Were it not for r'', they would have to be escaped twice, i.e. quadrupled, to satisfy both the Python interpreter and the regex engine:

    re.sub('(?P<left-edge>e\\^{\\\\left\\()(?P<input>.*)(?P<right-edge>\\\\right\\)})','e\\^{\\\\g<input>}',string)

(Additionally, you forgot to escape one of the ^ carets. I corrected that in both of my examples.)
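
The two escaping levels described above can be checked directly. This is a minimal sketch, not a full solution: the variable names text, raw_pattern and plain_pattern are made up for illustration, and only the left edge of the pattern is tested.

```python
import re

# The text itself contains single backslashes; they are doubled only
# because this is an ordinary (non-raw) Python string literal.
text = 'e^{\\left(2 \\, x\\right)}'

# Raw literal: only the regex-level escaping is written out.
raw_pattern = r'e\^{\\left\('

# Plain literal: every backslash of the raw form is doubled once more.
plain_pattern = 'e\\^{\\\\left\\('

# Both literals produce exactly the same pattern text ...
print(raw_pattern == plain_pattern)

# ... and that pattern matches the literal characters e^{\left(
print(re.match(raw_pattern, text) is not None)
```

Both print statements should report True, which is the sense in which a backslash is "quadrupled" without r''.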

.* is greedy and matches too much; use the non-greedy .*? instead:

print(re.sub(r'e\^{\\left\((.*?)\\right\)}', r'e^{\1}', string))

Output:

\(2 \, e^{2 \, x} \sin\left(2 \, x\right) + 2 \, e^{2 \, x} \cos\left(2 \, x\right)\)
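
To see the difference between the greedy and non-greedy forms side by side, here is a small sketch that also keeps a named group, as in the original attempt. The group name inner is an assumption chosen for illustration: Python's re module requires group names to be valid identifiers, so a hyphenated name such as left-edge is rejected when the pattern is compiled.

```python
import re

string = ('\\(2 \\, e^{\\left(2 \\, x\\right)} \\sin\\left(2 \\, x\\right)'
          ' + 2 \\, e^{\\left(2 \\, x\\right)} \\cos\\left(2 \\, x\\right)\\)')

# A hyphen is not allowed in a group name, so this pattern fails to compile.
try:
    re.compile(r'(?P<left-edge>x)')
    hyphen_name_ok = True
except re.error as exc:
    hyphen_name_ok = False
    print('invalid group name:', exc)

# Greedy: .* runs on to the LAST "\right)}" in the string.
greedy = re.sub(r'e\^{\\left\((?P<inner>.*)\\right\)}', r'e^{\g<inner>}', string)

# Non-greedy: .*? stops at the FIRST "\right)}" after each "e^{\left(".
lazy = re.sub(r'e\^{\\left\((?P<inner>.*?)\\right\)}', r'e^{\g<inner>}', string)

print(greedy)
print(lazy)
```

Only the non-greedy result has both exponents unwrapped. The greedy version treats everything between the first e^{\left( and the last \right)} as a single match. In the raw replacement string the ^ is an ordinary character and is written without a backslash.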
