
Eliminate whitespace between words using regex in Python

I want to remove the whitespace between two specific words in a sentence that contains many words.

My code looks like this:

import re
sentence = "open app store"
pattern = re.compile(r'\b([a-z]) (?=[a-z]\b)', re.I)
sentence = re.sub(pattern, r'\g<1>', sentence)
print(sentence)

output:

open app store

I want to remove the whitespace between "app" and "store", so the output should be "open appstore".

Note that "app" won't always be followed by "store"; it can be followed by some other word, e.g. "app maker".

Let's have a look at your pattern: it matches a word boundary, then captures a single ASCII letter into Group 1, then matches a space, and then asserts that the next character is a single ASCII letter followed by a word boundary. So it can match "a b" in "My a b string", but not "app store".
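To make that concrete, here is a small check that runs the question's pattern on both kinds of input. The sample string "My a b string" is only an illustration, not something from the question.

```python
import re

# The pattern from the question: a single-letter word, one space,
# then a lookahead for another single-letter word.
pattern = re.compile(r'\b([a-z]) (?=[a-z]\b)', re.I)

single_letters = pattern.sub(r'\g<1>', "My a b string")
whole_words = pattern.sub(r'\g<1>', "open app store")

print(single_letters)  # My ab string
print(whole_words)     # open app store (unchanged)
```

Only the space between the one-letter words is removed; longer words such as "app" and "store" never satisfy the single-letter requirement.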

Now, it seems your app value is static, and you want to match one or more whitespace characters after it only if another word follows app. You may follow two strategies.

You may match app when it is followed by whitespace and a letter, and then remove the whitespace:

re.sub(r"\b(app)\s+([a-z])", r"\1\2", sentence, flags=re.I)

Alternatively, you may list the known words that follow app and only remove the whitespace before those:

re.sub(r"\b(app)\s+(store|maker|market|etc)", r"\1\2", sentence, flags=re.I)
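A quick side-by-side of the two strategies may help when choosing between them. This is only a sketch: the sample sentence is made up, the `etc` placeholder is dropped from the word list, and the trailing `\b` in the second pattern is my own addition so that only whole words are joined.

```python
import re

sentence = "Open app store, then try the app maker. The app is free."

# Strategy 1: join "app" to whatever word follows it.
any_word = re.sub(r"\b(app)\s+([a-z])", r"\1\2", sentence, flags=re.I)

# Strategy 2: join "app" only to words from a known list.
known_words = re.sub(r"\b(app)\s+(store|maker|market)\b", r"\1\2", sentence, flags=re.I)

print(any_word)     # Open appstore, then try the appmaker. The appis free.
print(known_words)  # Open appstore, then try the appmaker. The app is free.
```

The first strategy also glues "app is" together, so the word list is the safer choice when "app" can appear as an ordinary word in the sentence.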


This could work for you.

>>> import re
>>> sentence = "this is an open app store and this is another open app store."
>>> pattern = re.compile(r'app[\s]store')
>>> replacement = 'appstore'
>>> result = re.sub(pattern, replacement, sentence)
>>> result
'this is an open appstore and this is another open appstore.'

Edit: You could use this function to eliminate whitespace(s) between any two words.

import re

def remove_spaces(text, word_one, word_two):
    """Return text after removing whitespace(s) between two specific words.

    >>> remove_spaces("an app store app maker app    store", "app", "store")
    'an appstore app maker appstore'
    """
    # Escape the words so any regex metacharacters in them are matched literally.
    pattern = re.compile(r'{}\s*{}'.format(re.escape(word_one), re.escape(word_two)))    # zero or more whitespace characters
    replacement = word_one + word_two
    result = re.sub(pattern, replacement, text)

    return result
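One caveat: a pattern without word boundaries also matches inside longer words (for example the "app" in "snapp"). Below is a hypothetical variant, `remove_spaces_whole_words` (my own name, not part of the answer above), that only joins whole words and keeps the original capitalisation. It assumes both words consist of letters or digits, since `\b` behaves differently next to punctuation.

```python
import re

def remove_spaces_whole_words(text, word_one, word_two):
    """Join word_one and word_two only where both appear as whole words."""
    pattern = re.compile(
        r'\b({})\s+({})\b'.format(re.escape(word_one), re.escape(word_two)),
        re.I,  # case-insensitive, so "App Store" is matched too
    )
    # Reuse the matched groups so the text keeps its original capitalisation.
    return pattern.sub(r'\1\2', text)

result = remove_spaces_whole_words(
    "An App Store, a snapp store, an app storefront", "app", "store"
)
print(result)  # An AppStore, a snapp store, an app storefront
```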

Try this:

import re
sentence = "This is test"
pattern = re.compile(r'(.*)\b\s+(?=[a-z])', re.I | re.S)
sentence = re.sub(pattern, r'\1', sentence)
print(sentence)

Output: This istest

Hope it works for you.
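Be aware of how this pattern behaves on longer input: as far as I can tell, because `(.*)` is greedy, only the last run of whitespace that precedes a letter is removed. A small check, with a second sample sentence that is my own:

```python
import re

# Same pattern as above: a greedy (.*) followed by whitespace before a letter.
pattern = re.compile(r'(.*)\b\s+(?=[a-z])', re.I | re.S)

two_gaps = pattern.sub(r'\1', "This is test")
three_gaps = pattern.sub(r'\1', "open app store now")

print(two_gaps)    # This istest
print(three_gaps)  # open app storenow
```

So it joins the final two words of the sentence, which is not necessarily the "app store" pair asked about in the question.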

