Python re.sub with a variable

Question

Input text:

Ell &#233;s la v&#237;ctima que expia els nostres pecats, i no tan sols els nostres, sin&#243; els del m&#243;n sencer.

Expected output:

Ell és la víctima que expia els nostres pecats, i no tan sols els nostres, sinó els del món sencer.

Known facts: unichr(233)=é

For now I have:

re.sub('&#([^;]*);', r'unichr(int(\1))', inputtext, flags=re.UNICODE)

and of course it is not working. I don't know how to apply a function to \1.

Any ideas?

Answer 1

Use a lambda function:

re.sub('&#([^;]*);', lambda match: unichr(int(match.group(1))), t, flags=re.UNICODE)

Answer 2

Fortunately for you, re.sub also accepts a function as the replacement argument. The function receives a match object; from there, you can get the matched groups with match.group(1), match.group(2), etc. The return value of the function is the string that replaces the matched text in the input.

def fn(match):
  return unichr(int(match.group(1)))

re.sub('&#([^;]*);', fn, inputtext, flags=re.UNICODE)
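For readers on Python 3, where unichr no longer exists and chr covers the same job, the same idea might look like the sketch below. The name decode_numeric_entity and the stricter pattern &#(\d+); are illustrative choices, not part of the original answer; the stricter pattern simply skips references that int() could not parse. The first call also shows why the attempt in the question fails: a string replacement is treated as a template, so nothing in it is ever called.

```python
import re

def decode_numeric_entity(match):
    # match.group(0) is the whole reference, e.g. '&#233;';
    # match.group(1) is the captured decimal code point, e.g. '233'.
    return chr(int(match.group(1)))

text = 'sin&#243; els del m&#243;n sencer'

# String replacement: \1 is substituted into the template, no function runs.
literal = re.sub(r'&#(\d+);', r'chr(int(\1))', text)

# Callable replacement: invoked once per match with the match object.
decoded = re.sub(r'&#(\d+);', decode_numeric_entity, text)

print(literal)  # sinchr(int(243)) els del mchr(int(243))n sencer
print(decoded)  # sinó els del món sencer
```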

If you really want, you can inline this and use a lambda, but I think a lambda makes it harder to read in this case ¹.

By the way, depending on your Python version, there are better ways to unescape HTML (they also handle the special escape sequences like '&amp;'):

Python 2.x

>>> import HTMLParser
>>> s = 'Ell &#233;s la v&#237;ctima que expia els nostres pecats, i no tan sols els nostres, sin&#243; els del m&#243;n sencer.'
>>> print HTMLParser.HTMLParser().unescape(s)
Ell és la víctima que expia els nostres pecats, i no tan sols els nostres, sinó els del món sencer.

Python 3.x

>>> import html
>>> html.unescape(s)
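As a self-contained illustration of the Python 3 route (html.unescape is available from Python 3.4), the sketch below uses a made-up sample string that mixes decimal, hexadecimal, and named references, which the regex approach above would not all cover:

```python
import html

# Hypothetical sample: decimal (&#243;), named (&amp;) and hex (&#xE9;) references.
s = 'sin&#243; els del m&#243;n sencer &amp; m&#xE9;s'

unescaped = html.unescape(s)
print(unescaped)  # sinó els del món sencer & més
```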


¹ especially if you give fn a more sensible name ;-)

2 answers

Answer 1 (5) 2015-01-13 00:25:34

Answer 2 (4, accepted) 2015-01-13 00:26:26
