Regular expression to match strings between two different delimiters (e.g. "<" and ">")

Question

I want to split my text into contexts and captures. The lines I have look like this:

<abc> e f <ghi>
<abc> e f
e f <ghi>

Here I want to create rules that only affect the strings inside the <> markers. An example output would be:

<aabbcc> e f <gghhii>
<axyzbxyzcxyz> e f 
e f <g_h_i_>

Using line.split('')[i] doesn't cut it because I have two different separators.

Answer 1

You can use re.sub with a replacement callback function to replace the parts within <...>:

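import re

def replace_function(match):
    # Double every character captured between the angle brackets
    return '<' + ''.join(c + c for c in match.group(1)) + '>'

# Non-greedy (.*?) so that each <...> pair is matched separately
text = re.sub(r"<(.*?)>", replace_function, text)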

This duplicates the characters inside each tag, but you can extend the function in any way you like to perform more complex substitutions.
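As a rough sketch of how the callback might be extended, the snippet below builds one callback per rule and applies each to one of the sample lines from the question. The names TAG, make_rule, double, add_xyz and add_underscore are hypothetical helpers made up for this illustration, and pairing one rule with each line is only an assumption about what the asker intended.

```python
import re

# Non-greedy pattern: captures the text between each "<" and the next ">"
TAG = re.compile(r"<(.*?)>")

def make_rule(transform):
    """Turn a per-character transform into a callback usable with re.sub."""
    def callback(match):
        inner = "".join(transform(c) for c in match.group(1))
        return "<" + inner + ">"
    return callback

# Three illustrative rules, chosen to mirror the outputs shown in the question
double = make_rule(lambda c: c + c)
add_xyz = make_rule(lambda c: c + "xyz")
add_underscore = make_rule(lambda c: c + "_")

lines = ["<abc> e f <ghi>", "<abc> e f", "e f <ghi>"]
rules = [double, add_xyz, add_underscore]

# Text outside the markers is left untouched; only <...> contents change
results = [TAG.sub(rule, line) for rule, line in zip(rules, lines)]

for result in results:
    print(result)
```

Because the pattern only matches the marked regions, the surrounding context (here " e f ") passes through unchanged, which is why no explicit split on two separators is needed.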

Solution 1
1 2015-10-28 14:44:41
