Remove character from backreference after using re.sub
I have a string that mentions several usernames, for instance:
s = "@romeo went to @juliet and said hi, I'm @romeo"
I want to replace each username pattern with a link to the user's profile, so that it becomes <a href="/u/username">@username</a>
I can now replace the patterns; however, using backreferences, I cannot seem to get rid of the @ in the href.
print(re.sub(r"(^|[^@\w])@(\w{1,31})", r'<a href="/u/\g<0>">\g<0></a>', s))
Right now that prints:
<a href="/u/@romeo">@romeo</a> went to<a href="/u/ @juliet"> @juliet</a> and said hi, I'm<a href="/u/ @romeo"> @romeo</a>
As you can see, the output contains an extra space and an @ that I cannot seem to get rid of after applying the regex.
You need to use
print(re.sub(r"\B(?<!@)@(\w{1,31})", r'<a href="/u/\1">\g<0></a>', s))
See the Python demo and the regex demo.
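For convenience, here is a self-contained sketch of the call above, with the import and the sample string from the question. The names `pattern`, `replacement`, and `linked` are only illustrative and do not appear in the original post.

```python
import re

# Sample string from the question (double-quoted so the apostrophe is valid).
s = "@romeo went to @juliet and said hi, I'm @romeo"

# \B and (?<!@) only check the context; they consume no characters.
pattern = r"\B(?<!@)@(\w{1,31})"

# \1 is the username without "@"; \g<0> is the whole match, i.e. "@username".
replacement = r'<a href="/u/\1">\g<0></a>'

linked = re.sub(pattern, replacement, s)
print(linked)
```

With the sample string, the href values should contain only the bare usernames, while the link text keeps the leading @.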
Regex
\B@ - a @ char either at the start of the string or immediately preceded by a non-word char
(?<!@) - the @ must not be immediately preceded by another @
(\w{1,31}) - Capturing group 1 (\1): one to thirty-one word chars.
The \1 in r'<a href="/u/\1">\g<0></a>' stands for the Group 1 value. \g<0> stands for the whole match. Because \B and (?<!@) are zero-width assertions, the whole match is just @username, with no preceding character.
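To make the difference between the two patterns concrete, the following sketch compares what each one treats as the whole match, and then tries the suggested pattern on a few extra inputs. The variable names and the extra sample strings (`bob@example.com`, `@@alice`, `hi @carol!`) are made up for illustration and are not from the original post.

```python
import re

s = "@romeo went to @juliet"

# Pattern from the question: group 1 consumes the character before "@",
# so the whole match (group 0) includes that character.
old = re.compile(r"(^|[^@\w])@(\w{1,31})")
old_matches = [m.group(0) for m in old.finditer(s)]
print(old_matches)

# Suggested pattern: the context checks are zero-width,
# so group 0 is exactly "@name" and group 1 is "name".
new = re.compile(r"\B(?<!@)@(\w{1,31})")
new_matches = [(m.group(0), m.group(1)) for m in new.finditer(s)]
print(new_matches)

# Hypothetical extra inputs: an email-like string, a doubled "@", a plain mention.
for text in ["mail me at bob@example.com", "@@alice", "hi @carol!"]:
    print(text, "->", new.findall(text))
```

The first list shows the stray leading space that ended up in the question's output; with the suggested pattern, the email-like string and the doubled @ should produce no matches, since \B and (?<!@) reject them.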