
How to print the text including a keyword and the text after it from a text file?

I am trying to print the text after a specific string.

file.txt

I am: "eating", mango
I am: eating a pine apple; and mango

I am trying to write code that searches for the keyword am: and prints the text inside the "". If there are no "" in a line after am:, I want to print everything up to the ; (or, more simply, the next 3 words).

output.txt

I am: eating
I am: eating a pine apple

My work:

with open('input.txt', 'r') as f, open ("output.txt", 'w') as out_fh:
    for line in f:
        str = re.search(r'\bam: "([^"]+)"', line).group()[0]
        if str:
            out_fh.write(str)
        else:
            a = re.compile(r'am:((\w+){3}')
            out_fh.write(a)

Not sure where I am going wrong. Any help would be appreciated. Thank you

You may use a single regex to fetch the expected result:

rx = re.compile(r'^(I am:\s*)("[^"]*"|[^;]*)')

See the regex demo. The regex matches:

  • ^ - start of a string
  • (I am: - start of Capturing group 1: I am: string
  • \s*) - 0+ whitespaces, end of capturing group 1
  • ("[^"]*"|[^;]*) - Capturing group 2: either a " followed by any 0 or more chars other than " and then a closing ", or any 0+ chars other than ;
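
To make the group breakdown concrete, here is a small sketch that runs the pattern against the two sample lines from the question and shows what each capturing group holds. The variable names (samples, groups) are illustrative only.

```python
import re

rx = re.compile(r'^(I am:\s*)("[^"]*"|[^;]*)')

samples = [
    'I am: "eating", mango',
    'I am: eating a pine apple; and mango',
]

# Collect (group 1, group 2) for each sample line; both lines match.
groups = [rx.search(s).groups() for s in samples]
for g1, g2 in groups:
    print(repr(g1), repr(g2))
# 'I am: ' '"eating"'
# 'I am: ' 'eating a pine apple'
```

Note that the second group still contains the surrounding " chars when the first alternative matches, so they have to be stripped separately.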

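In your code, use it like this:

import re

# Group 1 keeps the "I am: " prefix; group 2 is the quoted text or everything up to ";"
rx = re.compile(r'^(I am:\s*)("[^"]*"|[^;]*)')
with open('input.txt', 'r') as f, open("output.txt", 'w') as out_fh:
    for line in f:
        m = rx.search(line)
        if m:
            # Append a newline so each result lands on its own line
            out_fh.write("{}{}\n".format(m.group(1), m.group(2).strip('"')))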

Note that .strip('"') will remove the leading and trailing " chars captured with the first alternative in Group 2.

See a Python demo:

import re
text = """I am: "eating", mango
I am: eating a pine apple; and mango"""
rx = re.compile(r'^(I am:\s*)("[^"]*"|[^;]*)')
for line in text.splitlines():
    m = rx.search(line)
    if m:
        print("{}{}".format(m.group(1), m.group(2).strip('"')))

Output:

I am: eating
I am: eating a pine apple
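
One caveat when reading from a real file: if a line has neither quotes nor a ;, then [^;]* runs to the end of the line and picks up the trailing newline as well. Below is a minimal sketch of a helper that guards against this, assuming the same regex as above. The function name extract is hypothetical and not part of the original answer.

```python
import re

rx = re.compile(r'^(I am:\s*)("[^"]*"|[^;]*)')

def extract(line):
    """Return 'I am: <text>' for a matching line, or None if there is no match."""
    m = rx.search(line)
    if not m:
        return None
    # Drop trailing whitespace (including a newline) first, then the quotes.
    return m.group(1) + m.group(2).rstrip().strip('"')

print(extract('I am: "eating", mango\n'))      # I am: eating
print(extract('I am: eating a pine apple\n'))  # I am: eating a pine apple
print(extract('You are: reading\n'))           # None
```

With this helper, the file loop only needs to write extract(line) plus a newline whenever the result is not None.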


 