Regular expression to replace a substring within a string in Python
I have to replace a substring with another string in a file.
Below is the line which is present in the file.
Input:
#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */
Expected output:
#pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]
Below is my code:
import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
comment = re.search(r'([\/\*]+).*([^\*\/]+)', line)
replace = re.search(r'([\[[]).*([\]]])', comment.group())
replaceWith = replace.group()
content_new = re.sub(r'([\/\*]).*([\*\/])', '# ' + replaceWith, line)
Is there a better solution than the code above?
You need to match the comments first, say, with a regex that matches a C-style multiline comment, and then replace each match with the [[...]] substring found inside it. This approach is the safest: it won't fail if a comment contains [[ but no ]], or if there are several such comments in the string.
The sample code snippet will look like this:
import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
content_new = re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()), line)
print(content_new)
Output:
#pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]
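To illustrate the robustness claim in this answer, here is a minimal sketch that runs the same substitution on two made-up inputs: a line with two comments, and a comment that contains [[ but no closing ]]. The helper name convert and both sample strings are assumptions for illustration, not part of the question.

```python
import re

# C-style comment pattern, copied from the snippet above.
COMMENT = r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/'

def convert(line):
    # Rewrite each comment as '# [[...]]'; a comment without a
    # complete [[...]] is returned unchanged by the inner re.sub.
    return re.sub(
        COMMENT,
        lambda m: re.sub(r'.*(\[\[.*?]]).*', r'# \1', m.group()),
        line,
    )

print(convert('a /* x [[one]] */ b /* y [[two]] */'))
# a # [[one]] b # [[two]]
print(convert('a /* no closing [[ here */'))
# a /* no closing [[ here */
```

Each comment is handled on its own, so an unbalanced [[ in one comment cannot swallow text from another.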
Details:
re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', ..., line) - finds all C-style comments in the string, and
lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()) - is the replacement: x is the match object holding the comment text. The inner pattern matches any text up to [[, then captures [[ and any text up to and including ]], then matches the rest of the comment, and the whole match is replaced with #, a space and the Group 1 value.
See the regex demo here.
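If the one-liner is hard to read, the same idea can be spelled out with precompiled patterns and a named replacement function. This is only a sketch under a few assumptions: the names COMMENT_RE, TAG_RE, comment_to_hash and convert_text are hypothetical, the second sample line is invented, and re.search is swapped in for the inner re.sub. That swap changes one detail: when a comment holds several [[...]] blocks, this version keeps the first one, whereas the one-liner keeps the last.

```python
import re

# C-style comment matcher, same pattern as in the answer.
COMMENT_RE = re.compile(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/')
# First [[...]] block inside a comment (shortest possible match).
TAG_RE = re.compile(r'\[\[.*?]]')

def comment_to_hash(match):
    """Return '# [[...]]' for a comment match, or the comment unchanged."""
    comment = match.group()
    tag = TAG_RE.search(comment)
    return '# ' + tag.group() if tag else comment

def convert_text(text):
    # Apply the replacement to every comment found in the text.
    return COMMENT_RE.sub(comment_to_hash, text)

sample = (
    '#pragma MESSAGE "Hello World" 0000=1 '
    '/* Clarification 0001: [[Specific Clarification]] */\n'
    'int x = 0; /* plain comment */\n'
)
print(convert_text(sample), end='')
# #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]
# int x = 0; /* plain comment */
```

For a real file, the text passed to convert_text would be the file's contents, read in whatever way suits your script.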