
Remove character from backreference after using re.sub

I have a string that mentions a list of usernames, for instance:

import re
s = "@romeo went to @juliet and said hi, I'm @romeo"

I want to replace each username pattern with a link to the user's profile, so that it becomes <a href="/u/username">@username</a>. I am now able to replace the patterns; however, I cannot seem to get rid of the @ in the href using backreferences.

print(re.sub(r"(^|[^@\w])@(\w{1,31})", r'<a href="/u/\g<0>">\g<0></a>', s))

That currently prints:

<a href="/u/@romeo">@romeo</a> went to<a href="/u/ @juliet"> @juliet</a> and said hi, I'm<a href="/u/ @romeo"> @romeo</a>

As you can see, the href contains an extra space and an @ that I cannot seem to get rid of after applying the regex.

You need to use

print(re.sub(r"\B(?<!@)@(\w{1,31})", r'<a href="/u/\1">\g<0></a>', s))

See the Python demo and the regex demo.
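For convenience, here is a self-contained sketch of the same substitution. The names MENTION and linkify are illustrative only and do not come from the question; the pattern and replacement string are the ones shown above.

```python
import re

# Same pattern as in the answer, compiled once so it can be reused.
# MENTION and linkify are hypothetical names chosen for this sketch.
MENTION = re.compile(r"\B(?<!@)@(\w{1,31})")

def linkify(text):
    # \1 is the username without "@"; \g<0> is the whole match ("@username").
    return MENTION.sub(r'<a href="/u/\1">\g<0></a>', text)

s = "@romeo went to @juliet and said hi, I'm @romeo"
print(linkify(s))
# <a href="/u/romeo">@romeo</a> went to <a href="/u/juliet">@juliet</a> and said hi, I'm <a href="/u/romeo">@romeo</a>
```

Because the pattern no longer consumes the character before the @, the surrounding spaces stay outside the generated link.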

Regex details:

  • \B@ - an @ char either at the start of the string, or immediately preceded by a non-word char
  • (?<!@) - that @ must not be immediately preceded by another @
  • (\w{1,31}) - Capturing group 1 (\1): one to thirty-one word chars.
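The effect of each part of the pattern can be checked with a few inputs. This is only a quick sketch; the sample strings are made up for illustration.

```python
import re

pattern = re.compile(r"\B(?<!@)@(\w{1,31})")

# "@" after a space or "(" (non-word chars) is accepted by \B.
after_space = pattern.findall("hi @romeo")
after_paren = pattern.findall("(@juliet)")

# "@" after a word char (as in an email address) is rejected by \B.
in_email = pattern.findall("bob@example.com")

# "@@name" is rejected: the first "@" is not followed by a word char,
# and the second one is ruled out by the (?<!@) lookbehind.
doubled = pattern.findall("@@romeo")

# {1,31} caps the capture at 31 chars; a longer name is cut off, not skipped.
long_name = pattern.findall("@" + "a" * 40)

print(after_space, after_paren, in_email, doubled, len(long_name[0]))
# ['romeo'] ['juliet'] [] [] 31
```

Note the last case: the pattern does not require a boundary after the username, so a name longer than 31 characters is matched partially. If that matters, a trailing \b or (?!\w) could be appended, which is an addition beyond the original answer.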

The \1 in r'<a href="/u/\1">\g<0></a>' stands for the Group 1 value. \g<0> stands for the whole match.
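To make the difference between the group references concrete, the sketch below inspects a match object and also shows one possible alternative: keeping the question's original pattern and putting its prefix group back with \1. The variable names are illustrative.

```python
import re

s = "@romeo went to @juliet"

# With the answer's pattern, group 0 is "@username" and group 1 is "username".
m = re.search(r"\B(?<!@)@(\w{1,31})", s)
whole, name = m.group(0), m.group(1)
print(whole, name)  # @romeo romeo

# Alternative sketch: keep the question's pattern, where group 1 is the
# prefix (start of string or one non-@, non-word char) and group 2 is the
# username, and re-emit the prefix outside the link.
original = r"(^|[^@\w])@(\w{1,31})"
fixed = re.sub(original, r'\1<a href="/u/\2">@\2</a>', s)
print(fixed)
# <a href="/u/romeo">@romeo</a> went to <a href="/u/juliet">@juliet</a>
```

In the question's attempt, \g<0> was used in both places, so the prefix character and the @ both ended up inside the href.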


 