Regex replacement of a substring within a string in Python

Question

I have to replace a substring with another string in a file.

Below is the line that is present in the file.

Input: #pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */

Expected Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Below is my code:

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
comment = re.search(r'([\/\*]+).*([^\*\/]+)', line)
replace = re.search(r'([\[[]).*([\]]])', comment.group())
replaceWith = replace.group()
content_new = re.sub(r'([\/\*]).*([\*\/])', '# ' + replaceWith, line)

Is there a better solution than the code above?

Answer 1

You need to match the comments, say, with the pattern from "Regex to match a C-style multiline comment", and then replace each match with the [[...]] substring found inside it. This approach is the safest: it won't fail if a comment contains [[ but no ]], or if there are several such comments in the string.

The sample code snippet will look like this:

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
content_new = re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()), line)
print(content_new)

Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Details:

re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', ..., line) - finds all C-style comments in the string, and
lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()) is the replacement: x is the match object holding the comment text. The inner pattern matches any text up to [[ , then captures [[ and any text up to and including ]] , then matches the rest of the comment, and replaces the whole comment with # , a space and the Group 1 value. See the regex demo here.
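The robustness described above can be checked with a small sketch. It reuses the two patterns from this answer unchanged. The names COMMENT_RE, BRACKETS_RE, shorten_comment and rewrite_line are made up for illustration, and the test strings other than the original line are assumed examples, not taken from the question.

```python
import re

# Pattern from the answer: matches one C-style /* ... */ comment.
COMMENT_RE = re.compile(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/')
# Pattern from the answer: captures a [[...]] substring inside a comment.
BRACKETS_RE = re.compile(r'.*(\[\[.*?]]).*')


def shorten_comment(match):
    """Hypothetical helper: turn a comment into '# [[...]]'.

    If the comment has no complete [[...]], the inner pattern does not
    match and the comment text is returned unchanged.
    """
    return BRACKETS_RE.sub(r'# \1', match.group())


def rewrite_line(line):
    """Hypothetical helper: apply shorten_comment to every comment in line."""
    return COMMENT_RE.sub(shorten_comment, line)


line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
print(rewrite_line(line))

# Several comments in one string: each one is handled separately.
print(rewrite_line('a /* x [[one]] */ b /* y [[two]] */'))

# A comment with [[ but no ]] is left as it is.
print(rewrite_line('x = 1 /* note [[unfinished */'))
```

Two limits of this sketch are worth keeping in mind. Because the leading .* in the inner pattern is greedy, a comment containing several [[...]] groups keeps only the last one. And since . does not match a newline by default, a comment spanning multiple lines would only have the line containing [[...]] rewritten.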

Solution 1
1 Accepted 2020-07-28 19:39:22
