How to match and replace undefined numbers of a pattern using re.sub

I want to match a pattern that occurs anywhere from 0 to n times in some text and replace every occurrence.

Here is some sample text:

XYZ

WWW

OOO

|NOTE:

ABC

DEF

GHI

3+|

HERE

I want to convert the above text to the following (I only wish to convert the part between "|NOTE:" and "3+|"):

XYZ

WWW

OOO

|NOTE:ABCDEFGHI

3+|

HERE

Where the text above is contained in "input_txt", I can do it with the following code:

import re

# Captures exactly three lines after "|NOTE:" and joins them
input_txt = re.sub(
    r'\|(NOTE):\n*(.*)\n*(.*)\n*(.*)(\n*[0-9]*[\+]*[\|]*)',
    r'|\1:\2\3\4\5',
    input_txt
)

However, this code only works if there are exactly three \n-separated paragraphs after the "|NOTE:" text. How do I change this so that it will match and replace any number of \n characters? I would prefer to do this with re.sub if possible (for my own interest, as this is an issue I have come across before without knowing how to solve it), but I am also open to other suggestions for how it might better be done.

Try removing every run of newlines that is followed by a character other than a digit or "|":

input_txt = re.sub(r'\n+([^0-9|])', r'\1', input_txt)

Output:

|NOTE:ABCDEFGHI
3+|
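
Note that the one-line pattern above is applied to the whole string, so on the full sample it also joins the lines outside the block (XYZ, WWW, OOO, HERE). If only the part between "|NOTE:" and "3+|" should change, one possible approach is to match the whole block and pass a function to re.sub that strips the newlines inside it. This is a minimal sketch, not the only way to do it: the names NOTE_BLOCK and join_note_lines are made up for illustration, and it assumes the closing marker always looks like digits followed by "+|", as in the sample.

```python
import re

# Matches from "|NOTE:" up to (but not including) the newlines that precede
# the closing "<digits>+|" marker. The marker format is an assumption taken
# from the sample text. DOTALL lets "." cross line breaks.
NOTE_BLOCK = re.compile(r'\|NOTE:.*?(?=\n+\d+\+\|)', re.DOTALL)


def join_note_lines(text):
    """Remove newlines inside each |NOTE: block; leave other text untouched."""
    return NOTE_BLOCK.sub(lambda m: m.group(0).replace('\n', ''), text)


sample = "XYZ\n\nWWW\n\nOOO\n\n|NOTE:\n\nABC\n\nDEF\n\nGHI\n\n3+|\n\nHERE"
print(join_note_lines(sample))
```

Because the replacement is a function, it works for any number of paragraphs in the block, including none. If no closing marker is found, the pattern does not match and the text is returned unchanged.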
