Using split function in python3.5

Question

Trying to split the string at number 7 and I want 7 to be included in the second part of the split string.

Code:

a = 'cats can jump up to 7 times their tail length'

words = a.split("7")

print(words)

Output:

['cats can jump up to ', ' times their tail length']

String got split but second part doesn't include 7.

I want to know how I can include 7.

note: not a duplicate of Python split() without removing the delimiter because the separator has to be part of the second string.

Answer 1

A simple and naive way to do this is just to find the index of what you want to split on and slice it:

>>> a = 'cats can jump up to 7 times their tail length'
>>> ind = a.index('7')
>>> a[:ind], a[ind:]
('cats can jump up to ', '7 times their tail length')

Answer 2

Another way is to use str.partition :

a = 'cats can jump up to 7 times their tail length'
print(a.partition('7'))
# ('cats can jump up to ', '7', ' times their tail length')

To join the number again with the latter part you can use str.join :

x, *y = a.partition('7')
y = ''.join(y)
print((x, y))
# ('cats can jump up to ', '7 times their tail length')

Or do it manually:

sep = '7'
x = a.split(sep)
x[1] = sep + x[1]
print(tuple(x))
# ('cats can jump up to ', '7 times their tail length')

Answer 3

in one line, using re.split with the rest of the string, and filter the last, empty string that re.split leaves:

import re
a = 'cats can jump up to 7 times their tail length'
print([x for x in re.split("(7.*)",a) if x])

result:

['cats can jump up to ', '7 times their tail length']

using () in split regex tells re.split not to discard the separator. A (7) regex would have worked but would have created a 3-item list like str.partition does, and would have required some post processing, so no one-liner.

now if the number isn't known, regex is (again) the best way to do it. Just change the code to:

[x for x in re.split("(\d.*)",a) if x]

Answer 4

re can be used to capture globally as well:

>>> s = 'The 7 quick brown foxes jumped 7 times over 7 lazy dogs'
>>> sep = '7'
>>> 
>>> [i for i in re.split(f'({sep}[^{sep}]*)', s) if i]
['The ', '7 quick brown foxes jumped ', '7 times over ', '7 lazy dogs']

If the f-string is hard to read, note that it just evaluates to (7[^7]*) .
(To the same end as the listcomp one can use list(filter(bool, ...)) , but it's comparatively quite ugly)

In Python 3.7 and onward, re.split() allows splitting on zero-width patterns. This means a lookahead regex, namely f'(?={sep})' , can be used instead of the group shown above.

What's strange about this is the timings: if using re.split() (ie without a compiled pattern object), the group solution consistently runs about 1.5x faster than the lookahead. However, when compiled, the lookahead beats the other hands-down:

In [4]: r_lookahead = re.compile('f(?={sep})')

In [5]: r_group = re.compile(f'({sep}[^{sep}]*)')

In [6]: %timeit [i for i in r_lookahead.split(s) if i]
2.76 µs ± 207 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: %timeit [i for i in r_group.split(s) if i]
5.74 µs ± 65.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [8]: %timeit [i for i in r_lookahead.split(s * 512) if i]
137 µs ± 1.93 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [9]: %timeit [i for i in r_group.split(s * 512) if i]
1.88 ms ± 18.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

A recursive solution also works fine, although more slowly than splitting on a compiled regex (but faster than a straight re.split(...) ):

def splitkeep(s, sep, prefix=''):
    start, delim, end = s.partition(sep)
    return [prefix + start, *(end and splitkeep(end, sep, delim))]

>>> s = 'The 7 quick brown foxes jumped 7 times over 7 lazy dogs'
>>> 
>>> splitkeep(s, '7')
['The ', '7 quick brown foxes jumped ', '7 times over ', '7 lazy dogs']

Answer 5

Using enumerate, This only works if the string doesnt start with the seperator

s = 'The quick 7 the brown foxes jumped 7 times over 7 lazy dogs'

separator = '7'
splitted = s.split(separator)

res = [((separator if i > 0 else '') + item).strip() for i, item in enumerate(splitted)]

print(res)

['The quick', '7 the brown foxes jumped', '7 times over', '7 lazy dogs']

[Program finished]

Answer 6

There's also the possibility to do all of it using split and list comprehension, without the need to import any library. This will, however, make your code slightly "less pretty":

a = 'cats can jump up to 7 times their tail length'
sep = '7'
splitString = a.split(sep)
splitString = list(splitString[0]) + [sep+x for x in splitString[1:]]

And with that, splitString will carry the value:

['cats can jump up to ', '7 times their tail length']

Using split function in python3.5

Question

6 answers

solution1
8 2018-02-23 15:10:38

solution2
5 2018-02-23 15:13:57

solution3
5 ACCPTED 2018-02-23 15:23:31

solution4
1 2018-02-23 17:02:25

solution5
0 2021-03-31 13:50:37

solution6
0 2021-03-31 14:10:00

Using split function in python3.5

Question

6 answers

solution1 8 2018-02-23 15:10:38

solution2 5 2018-02-23 15:13:57

solution3 5 ACCPTED 2018-02-23 15:23:31

solution4 1 2018-02-23 17:02:25

solution5 0 2021-03-31 13:50:37

solution6 0 2021-03-31 14:10:00

solution1
8 2018-02-23 15:10:38

solution2
5 2018-02-23 15:13:57

solution3
5 ACCPTED 2018-02-23 15:23:31

solution4
1 2018-02-23 17:02:25

solution5
0 2021-03-31 13:50:37

solution6
0 2021-03-31 14:10:00