
Perform regex on a regex substitution?

Sorry if I used any wrong words in the title, but this issue has been bugging me.

I have some code that automatically links URLs found inside text.

r = re.compile(r'\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^%s\s]|/)))')
url_parsed_comment = r.sub(r'<a tabindex=-1 target="blank" href="\1">\1</a>', comment_text)

As you can see, I take the matched URL and pass it (\1) into both the href attribute and the text of the <a> tag.

I would like to truncate the text inside the <a> tag. The equivalent of what I want would look like this in Python:

link = '<a href="' + url + '">' + url[:10] + '...</a>'

How do I accomplish this with the substituted regex group? How do I limit the number of characters?

You can pass a function, such as a lambda, as the replacement argument to re.sub. For example:

>>> re.sub(r'([0-9]*)', lambda m: m.group(1)[:2], '---123456789---')
'---12---'

Inside the function, m.group(1) gives you the same text that \1 refers to in a replacement string.
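Applied to the URL-linking case from the question, this could look roughly like the sketch below. The pattern URL_RE is a simplified stand-in chosen for illustration, not the pattern from the question, and the helper name link_with_truncated_text and its limit parameter are hypothetical.

```python
import re

# Simplified URL pattern, used here only for illustration (an assumption,
# not the pattern from the question).
URL_RE = re.compile(r'\b((?:[\w-]+://|www\.)[^\s()<>]+)')


def link_with_truncated_text(match, limit=10):
    """Build an <a> tag whose visible text is at most `limit` characters of the URL."""
    url = match.group(1)  # the same text that \1 would insert
    # Only shorten (and add '...') when the URL is longer than the limit.
    text = url if len(url) <= limit else url[:limit] + '...'
    return '<a href="{0}">{1}</a>'.format(url, text)


comment_text = 'see http://example.com/some/long/path for details'
# re.sub calls the function once per match and inserts its return value.
url_parsed_comment = URL_RE.sub(link_with_truncated_text, comment_text)
print(url_parsed_comment)
# see <a href="http://example.com/some/long/path">http://exa...</a> for details
```

A named function is used instead of a lambda only because the body needs more than one expression. Unlike the one-liner in the question, this version leaves short URLs untouched rather than always appending '...'.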
