
Regular expression to replace a substring within a string in Python

I have to replace a substring with another string in a file.

The file contains the following line.

Input: #pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */

Expected Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Below is my code:

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
comment = re.search(r'([\/\*]+).*([^\*\/]+)', line)
replace = re.search(r'([\[[]).*([\]]])', comment.group())
replaceWith = replace.group()
content_new = re.sub(r'([\/\*]).*([\*\/])', '# ' + replaceWith, line)

Is there a better solution than the code above?

You need to match the comments first, for example with a regex that matches a C-style multiline comment, and then extract the [[...]] substring inside each match. This approach is the safest: it does not fail when a comment contains [[ but no ]], and it still works when the string contains several such comments.

A sample code snippet looks like this:

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
content_new = re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()), line)
print(content_new)

Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Details:

  • re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', ..., line) finds all C-style comments in the string, and
  • lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()) is the replacement: x is the match object holding the comment text. The inner pattern matches any text up to [[, captures [[ and everything up to and including the next ]], and then matches the rest of the comment. The whole comment is replaced with #, a space, and the Group 1 value.
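
The two steps described above can be sketched as a small reusable function. This is only an illustration, not part of the original answer: the names shorten_comments, COMMENT_RE and TAG_RE, and the sample strings several and unclosed, are made up here. The comment pattern is the same one as above, just written with re.VERBOSE so that each part can carry a note.

```python
import re

# Same C-style comment pattern as in the answer, spelled out with re.VERBOSE.
COMMENT_RE = re.compile(r'''
    /\*            # opening /*
    [^*]*\*+       # anything up to a run of one or more *
    (?:            # if that run is not followed by /, keep going:
        [^/*]      #   one char that is neither / nor *
        [^*]*\*+   #   then again anything up to the next run of *
    )*
    /              # closing /
''', re.VERBOSE)

# Captures a [[...]] pair inside the comment text (group 1).
TAG_RE = re.compile(r'.*(\[\[.*?]]).*')


def shorten_comments(text):
    """Replace each /* ... [[tag]] ... */ comment with '# [[tag]]'."""
    def repl(match):
        # If the comment has no complete [[...]] pair, TAG_RE matches
        # nothing and the comment text is returned unchanged.
        return TAG_RE.sub(r'# \1', match.group())
    return COMMENT_RE.sub(repl, text)


single = '#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
several = 'a=1 /* x: [[First]] */ b=2 /* y: [[Second]] */'   # hypothetical input
unclosed = 'c=3 /* z: [[Broken */'                           # hypothetical input

print(shorten_comments(single))    # #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]
print(shorten_comments(several))   # a=1 # [[First]] b=2 # [[Second]]
print(shorten_comments(unclosed))  # c=3 /* z: [[Broken */
```

One caveat with this sketch: since . does not match a newline by default, a comment that spans several lines would only have the line containing [[...]] rewritten, so it is best suited to single-line comments like the one in the question.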
