Sub the Nth Occurrence In a File Using Regex

I have an input file that I am reading, and I am trying to find the 3rd occurrence of the letter 'K' and change it to an 'L'. I read on here that you can pass the number of occurrences to skip as a parameter to re.sub. I passed 3, but it doesn't work: it still changes the first two K's.

Example:

Input:

1fKKvg   K 21000000001
1fKKvg   K 34210887632

Expected Output:

1fKKvg K 21000345678
1fKKvg L 34210887632

Code:

with open(file, 'r') as file:
    with open(dir+'wupannew2.txt', 'w') as fout:
        for f in file:
            if re.search(r'\b210',f):
                rflag = re.sub('L', 'K', f)
                fout.write(rflag)
                print(rflag.split())
            if not re.search('210', f):
                rflag = re.sub('K', 'L', f, 3)
                fout.write(rflag)
                print(rflag.split())

To change the third K to an L in your sample, you can use re.sub with a lookahead, like this:

import re

pattern = re.compile(r"K(?=\s3)")
re.sub(pattern, 'L', yourtext_variable)

This regex means:

  1. K -> match the character K
  2. (?=\s3) -> lookahead: the K must be followed by a whitespace character and then a 3 (these are checked but not consumed)
  3. re.sub(pattern, 'L', yourtext_variable) -> replace the matched K with L
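
As a quick check of the lookahead described above, here is a minimal sketch that runs the pattern against the two sample lines from the question. The `lines` and `results` names are only for illustration.

```python
import re

# Sample lines copied from the question's input.
lines = [
    "1fKKvg   K 21000000001",
    "1fKKvg   K 34210887632",
]

# K matches only when the next two characters are whitespace and '3'.
pattern = re.compile(r"K(?=\s3)")

results = [pattern.sub("L", line) for line in lines]
for r in results:
    print(r)
```

Only the second line changes, because its standalone K is the only one followed by a space and a 3. Note that this depends on the digit after the K, not on the K being the third occurrence.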

If you also want to remove two spaces before the K character, you can apply re.sub twice, like this:

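import re

# dir is the output directory prefix from the question's code
with open('wupan.txt', 'r') as file:
    with open(dir+'wupannew2.txt', 'w') as fout:
        for f in file:
            # drop the two whitespace characters immediately before a K
            tmp = re.sub(r"\s{2}(?=K)", "", f)
            # replace a K that is followed by whitespace and a 3
            fout.write(re.sub(r"K(?=\s3)", 'L', tmp))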

According to the Python documentation:

The optional argument count is the maximum number of pattern occurrences to be replaced;

So count does not skip the first occurrences; it only makes re.sub stop after replacing 3 K's.
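
To illustrate the difference, here is a minimal sketch that first shows what count=3 does on the sample line and then replaces only the third match by passing a function as the replacement. The helper name `replace_nth` is hypothetical, not part of the re module, and it treats `repl` as a literal string.

```python
import re

line = "1fKKvg   K 34210887632"

# count=3 replaces up to three matches, starting from the first.
print(re.sub("K", "L", line, count=3))

def replace_nth(pattern, repl, text, n):
    """Replace only the n-th (1-based) match of pattern in text."""
    seen = 0

    def swap(match):
        nonlocal seen
        seen += 1
        # Return the replacement for the n-th match, the original text otherwise.
        return repl if seen == n else match.group(0)

    return re.sub(pattern, swap, text)

print(replace_nth("K", "L", line, 3))
```

The first call changes all three K's, while `replace_nth` leaves the first two alone. If a line has fewer than n matches, it is returned unchanged.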

You may want to replace K only when it is surrounded by whitespace, if that's always the case in your input.
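
A minimal sketch of that idea, assuming the K to replace always has whitespace on both sides as in the sample input. The names `standalone_k` and `converted` are only for illustration.

```python
import re

# (?<=\s) requires whitespace before K, (?=\s) requires whitespace after it.
standalone_k = re.compile(r"(?<=\s)K(?=\s)")

line = "1fKKvg   K 34210887632"
converted = standalone_k.sub("L", line)
print(converted)
```

The two K's inside "1fKKvg" are left alone because neither has whitespace on both sides.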
