How to match and replace an undefined number of occurrences of a pattern using re.sub
I want to match a pattern that occurs 0 to n times in some text and replace the text when it does.
Here is some sample text:
XYZ
WWW
OOO
|NOTE:
ABC
DEF
GHI
3+|
HERE
I want to convert the above text to the following (I only wish to convert the part between "|NOTE:" and "3+|"):
XYZ
WWW
OOO
|NOTE:ABCDEFGHI
3+|
HERE
With the text above stored in "input_txt", I can do it with the following code:
import re

input_txt = re.sub(
    r'\|(NOTE):\n*(.*)\n*(.*)\n*(.*)(\n*[0-9]*[\+]*[\|]*)',
    r'|\1:\2\3\4\5',
    input_txt
)
However, this code only works if there are three \n-separated paragraphs after the "|NOTE:" text. How do I change it so that it will match and replace any number of \n characters?
I would prefer to do this with re.sub if possible (for my own interest, as this is an issue I have come across before without knowing how to solve it), but I am also open to other suggestions for how it might be done better.
Try:
input_txt = re.sub(r'\n+([^0-9|])', r'\1', input_txt)
Output:
|NOTE:ABCDEFGHI
3+|
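Note that the one-liner above removes a newline before any character that is not a digit or "|", so on the full sample it would also join the lines outside the note (XYZ, WWW, OOO and HERE). If only the part between "|NOTE:" and "3+|" should change, one option is to pass a function to re.sub. The following is a minimal sketch, assuming the block always ends at a line of the form digits, "+", "|"; the names note_block and join_lines are made up for illustration.

```python
import re

input_txt = "XYZ\nWWW\nOOO\n|NOTE:\nABC\nDEF\nGHI\n3+|\nHERE"

# Match from "|NOTE:" up to, but not including, the newline that
# precedes the closing line (assumed to look like "3+|").
note_block = re.compile(r'\|NOTE:.*?(?=\n[0-9]+\+\|)', re.DOTALL)

def join_lines(match):
    # Remove every newline inside the matched block only.
    return match.group(0).replace('\n', '')

result = note_block.sub(join_lines, input_txt)
print(result)
```

Because the newlines are removed inside the callback rather than by the pattern itself, this works for any number of lines in the note, including none, and leaves the text outside the block untouched.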