
Python: add a space before and after a search word in a file if one is missing

For each line in a file, we are trying to add a space before and after any word that matches one of a list of search words, wherever that space is missing.

Input: Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam.

import re
f1=open('input.txt', 'r')
f2=open('output.txt', 'w')
checkWords = ("Manager", "Director")

for line in f1:
    for checkword in checkWords:
        line = re.sub(r'(\b${0}\b)'.format(checkword), r'\1 ', line)
    print(line)
    f2.write(line)
f1.close()
f2.close()

Expected Output: Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr. Manager Sam.

Maybe you can use (index of checkword - 1) and (index of checkword + word's length) to check if there is a space in there or not. Then you can use replace() accordingly.
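A minimal sketch of that index-based idea, in case it helps. The helper names pad_word and pad_words are made up for illustration, and it uses str.find plus string slicing rather than replace(), which is one possible reading of the suggestion. Note that it matches plain substrings, so "Managers" would also be split.

```python
def pad_word(line, word):
    """Insert a space before/after each occurrence of word where one is missing."""
    pieces = []
    pos = 0
    while True:
        idx = line.find(word, pos)
        if idx == -1:
            # No further occurrence: keep the rest of the line unchanged
            pieces.append(line[pos:])
            break
        end = idx + len(word)
        pieces.append(line[pos:idx])
        # Character at (index - 1): add a space if it exists and is not whitespace
        if idx > 0 and not line[idx - 1].isspace():
            pieces.append(" ")
        pieces.append(word)
        # Character at (index + word length): same check on the other side
        if end < len(line) and not line[end].isspace():
            pieces.append(" ")
        pos = end
    return "".join(pieces)


def pad_words(line, words):
    """Apply pad_word for every search word in turn."""
    for word in words:
        line = pad_word(line, word)
    return line
```

Called as pad_words(line, ("Manager", "Director")) on the sample input, this turns "Mr.Manager" into "Mr. Manager" and "Director.Tom" into "Director .Tom".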

It's not very neat, but this gives you the expected output:

import re

s = "Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam."
words = ("Manager", "Director")


def add_spaces(string, words):

    for word in words:
        # pattern to match any non-space char before the word
        patt1 = re.compile(r'\S{}'.format(re.escape(word)))

        matches = re.findall(patt1, string)
        for match in matches:
            non_space_char = match[0]
            string = string.replace(match, '{} {}'.format(non_space_char, word))

        # pattern to match any non-space char after the word
        patt2 = re.compile(r'{}\S'.format(re.escape(word)))
        matches = re.findall(patt2, string)
        for match in matches:
            non_space_char = match[-1]
            string = string.replace(match, '{} {}'.format(word, non_space_char))

    return string


print(add_spaces(s, words))

Output:

Hi This is Manager Sam. Hello, this is Director .Tom. How is your Day Mr. Manager Sam.

Note that '\S' is the regex character class that matches any single non-whitespace character.

Edit: there's probably a neater way of doing it with re.sub ...
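One possible re.sub version, as a sketch rather than a definitive answer: it assumes lookbehind and lookahead assertions are acceptable here, and the function name add_spaces_sub is made up for illustration. The lookarounds check the neighbouring character without consuming it, so only the space gets inserted.

```python
import re


def add_spaces_sub(text, words):
    """Add a space before/after each search word where a non-space char touches it."""
    # One alternation for all words; re.escape guards against regex metacharacters
    alternation = "|".join(re.escape(w) for w in words)
    # (?<=\S): the word is directly preceded by a non-whitespace character
    text = re.sub(r"(?<=\S)({})".format(alternation), r" \1", text)
    # (?=\S): the word is directly followed by a non-whitespace character
    text = re.sub(r"({})(?=\S)".format(alternation), r"\1 ", text)
    return text


s = "Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam."
print(add_spaces_sub(s, ("Manager", "Director")))
# Hi This is Manager Sam. Hello, this is Director .Tom. How is your Day Mr. Manager Sam.
```

To apply it to a file, the same call can be made on each line inside the loop from the question, writing the returned string to the output file.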
