Pandas str.split without stripping split pattern

Question

Example code:

In [1]: import pandas as pd

In [2]: serie = pd.Series(['this#is#a#test', 'another#test'])

In [3]: serie.str.split('#', expand=True)
Out[3]:
         0     1     2     3
0     this    is     a  test
1  another  test  None  None

Is it possible to split without stripping the delimiter string? The desired output would be:

Out[3]:
         0      1     2      3
0     this    #is    #a  #test
1  another  #test  None   None

EDIT 1: The real use case is to keep the matched pattern, for instance:

serie.str.split(r'\n\*\*\* [A-Z]+', expand=True)

The [A-Z]+ parts are processing steps in my case, which I want to keep for further processing.

Answer 1

You could split using a positive lookahead. The split point is then the position just before the lookahead expression matches, so the delimiter itself is not consumed.

import pandas as pd

serie = pd.Series(['this#is#a#test', 'another#test'])
print(serie.str.split('(?=#)', expand=True))

OUTPUT

         0      1     2      3
0     this    #is    #a  #test
1  another  #test  None   None
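The same lookahead idea can be applied to the pattern from EDIT 1. The following is only a sketch: the sample text and the step names LOAD and CLEAN are made up for illustration, and dtype=object is an assumption chosen so that the split goes through Python's re module.

```python
import pandas as pd

# Hypothetical log text; the step names LOAD and CLEAN are invented for this example.
logs = pd.Series(
    [
        "header\n*** LOAD rows=3\n*** CLEAN dropped=1",
        "header\n*** LOAD rows=5",
    ],
    dtype=object,  # assumption: object dtype so Python's re engine does the split
)

# Wrap the original pattern in a lookahead: the split happens just before
# each "\n*** STEP" marker and the marker stays on the following piece.
parts = logs.str.split(r"(?=\n\*\*\* [A-Z]+)", expand=True)
print(parts)
```

Each piece after the first still starts with its "\n*** STEP" marker, so the step name remains available for further processing. Splitting on a zero-width match relies on Python 3.7 or later.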

Answer 2

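Try str.split('(#[a-z]+)', expand=True). Because the pattern is wrapped in a capturing group, the matched text is kept in the result.

Ex:

import pandas as pd

serie = pd.Series(['this#is#a#test', 'another#test'])
print(serie.str.split('(#[a-z]+)', expand=True))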
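One thing to watch for with the capturing-group approach: in this sample each match consumes the whole word after the '#', so the split leaves empty strings between adjacent matches and at the end. A minimal sketch of dropping them, assuming empty pieces are never meaningful in your data (dtype=object is again an assumption to keep Python's re module in use):

```python
import pandas as pd

serie = pd.Series(['this#is#a#test', 'another#test'], dtype=object)

# Without expand, each cell holds the raw list produced by the regex split.
pieces = serie.str.split(r'(#[a-z]+)')

# Drop the empty strings left between adjacent delimiter matches.
cleaned = pieces.apply(lambda items: [s for s in items if s])

# Rebuild a table; shorter rows are padded with missing values.
table = pd.DataFrame(cleaned.tolist())
print(table)
```

For the first row, pieces holds ['this', '#is', '', '#a', '', '#test', ''] and cleaned holds ['this', '#is', '#a', '#test'].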

Answer 3

Simply add the delimiter back to each cell after splitting:

In [1]: import pandas as pd

In [2]: serie = pd.Series(['this#is#a#test', 'another#test'])

In [3]: serie.str.split('#', expand=True) + '#'
Out[3]:
          0      1    2      3
0     this#    is#   a#  test#
1  another#  test#  NaN    NaN

In [4]: '#' + serie.str.split('#', expand=True)
Out[4]:
          0      1    2      3
0     #this    #is   #a  #test
1  #another  #test  NaN    NaN
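Note that both variants above also change the first column, which the desired output in the question leaves untouched. A possible adjustment, sketched under the assumption that the delimiter is a fixed literal string (so it does not cover the regex use case from EDIT 1), is to prepend '#' only to the columns after the first:

```python
import pandas as pd

serie = pd.Series(['this#is#a#test', 'another#test'], dtype=object)
parts = serie.str.split('#', expand=True)

# Leave column 0 as is and prepend '#' to every later column;
# cells that were missing after the split stay missing.
restored = pd.concat([parts.iloc[:, :1], '#' + parts.iloc[:, 1:]], axis=1)
print(restored)
```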

3 answers

solution1
5 ACCEPTED 2019-07-31 11:04:31

solution2
4 2019-07-31 10:59:56

solution3
0 2019-07-31 10:58:21