How to split a string by a substring without white spaces, while keeping its original white spaces?

I am looking for a way to split a string that contains white space (spaces, \n, \t) by a target phrase that has had its white space removed. I need to be able to get both the part before and the part after the target phrase, and the pieces must keep the white space of the original string.

Since the target phrase can occur more than once, I only want to split at the first occurrence to get the characters before it, and at the last occurrence to get the characters after it.

For example:

str = 'This is a test string for my test string example only.'
target_phrase = 'teststring'

Intended output:

('This is a', 'test string for my test string example only.') # Split by target phrase and getting characters prior to it
('This is a test string for my test string', 'example only.') #Split by target phrase and getting characters after it

Any hints gratefully received.

Is this acceptable? It does not handle the case where the target phrase is not found:

# Splits text at the first occurrence of targ, ignoring whitespace
# (spaces, \n, \t) in text. targ must already have its whitespace removed.
# Returns a tuple of the two substrings produced by the split.
# Raises ValueError if targ is not found.
def my_split(text, targ):
    idx = ''.join(text.split()).index(targ)

    # Next, in the original string that has whitespace,
    # we walk forward and count the non-whitespace characters until
    # the count reaches idx. When that happens, pos is the
    # split point in the original string.
    non_space = 0
    pos = 0
    while non_space < idx and pos < len(text):
        if not text[pos].isspace():
            non_space += 1
        pos += 1

    # Drop the whitespace that separates the two halves.
    return (text[:pos], text[pos:].lstrip())

Usage:

print (my_split(str, target_phrase))
print (tuple(s[::-1] for s in my_split(str[::-1], target_phrase[::-1]))[::-1])

Output:

('This is a', 'test string for my test string example only.')
('This is a test string for my test string', 'example only.')
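
As an alternative to counting characters by hand, a regular expression can allow optional whitespace between every character of the target phrase, which covers spaces, \n and \t and finds the first and last occurrences in one pass. This is only a sketch: the name split_around is made up for illustration, it assumes a non-empty target, it returns None when the phrase is not found, and "last occurrence" here means the last non-overlapping match.

```python
import re

def split_around(text, target):
    """Return (before_split, after_split) for a whitespace-free target.

    before_split splits text just before the first occurrence of target;
    after_split splits text just after the last (non-overlapping) occurrence.
    Returns None if target does not occur in text.
    """
    # Allow any amount of whitespace between consecutive target characters.
    pattern = re.compile(r'\s*'.join(re.escape(ch) for ch in target))
    matches = list(pattern.finditer(text))
    if not matches:
        return None
    first, last = matches[0], matches[-1]
    # Trim only the whitespace that separates the two halves.
    before_split = (text[:first.start()].rstrip(), text[first.start():])
    after_split = (text[:last.end()], text[last.end():].lstrip())
    return before_split, after_split

text = 'This is a test string for my test string example only.'
before_split, after_split = split_around(text, 'teststring')
print(before_split)  # ('This is a', 'test string for my test string example only.')
print(after_split)   # ('This is a test string for my test string', 'example only.')
```

Because the slices are taken from the original string, the whitespace inside each piece is preserved exactly; only the whitespace at the split point is trimmed.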

