How to match regex pattern and replace it with a matching group using Pandas?

Question

I have the following Pandas Series:

SC_S193_M7.CONTROLDAY10.EPI.P1_Stem
SC_S194_M7.CONTROLDAY10.EPI.P1_Goblet
SC_S102_M1.CONTROLDAY3.EPI2_Enterocyte
SC_S106_M1.CONTROLDAY3.EPI2_Goblet

I want to use a regex to extract the string after the last underscore in each row of this series. I was able to come up with a regex that matches up to the last string, but I am not sure how to apply it with a pandas Series method.

The regex I used to match the pattern and replace with the first matching group \\1 :

SC_S\\d{3}_M\\d\\.CONTROLDAY\\d{1,2}\\.EPI\\d?(?:\\.P\\d_|_)

I tried using .replace() as follows but that did not work out:

.replace('SC_S\\d{3}_M\\d\\.CONTROLDAY\\d{1,2}\\.EPI\\d?(?:\\.P\\d_|_)(\\w+)')

Any idea how to use a pandas Series method to extract the string after the last underscore, or to find the matching pattern and replace it with the first group?

Answer 1

I think you can split it instead of using RegEx:

In [170]: s
Out[170]:
0       SC_S193_M7.CONTROLDAY10.EPI.P1_Stem
1     SC_S194_M7.CONTROLDAY10.EPI.P1_Goblet
2    SC_S102_M1.CONTROLDAY3.EPI2_Enterocyte
3        SC_S106_M1.CONTROLDAY3.EPI2_Goblet
Name: 0, dtype: object

In [171]: s.str.split('_').str[-1]
Out[171]:
0          Stem
1        Goblet
2    Enterocyte
3        Goblet
Name: 0, dtype: object

Or, better, use rsplit(..., n=1), which splits only once, starting from the right:

In [174]: s.str.rsplit('_', n=1).str[-1]
Out[174]:
0          Stem
1        Goblet
2    Enterocyte
3        Goblet
Name: 0, dtype: object
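
To make the difference between the two calls concrete, here is a minimal sketch that looks at the intermediate lists. It assumes the series s from the question is rebuilt by hand; the variable names are only illustrative.

```python
import pandas as pd

# The series from the question, rebuilt by hand for this sketch
s = pd.Series([
    "SC_S193_M7.CONTROLDAY10.EPI.P1_Stem",
    "SC_S194_M7.CONTROLDAY10.EPI.P1_Goblet",
    "SC_S102_M1.CONTROLDAY3.EPI2_Enterocyte",
    "SC_S106_M1.CONTROLDAY3.EPI2_Goblet",
])

# split('_') cuts at every underscore, so each row here becomes four pieces
all_parts = s.str.split("_")

# rsplit('_', n=1) cuts once, at the last underscore: [prefix, suffix]
two_parts = s.str.rsplit("_", n=1)

# Either way, the last element of each list is the wanted label
last_via_split = all_parts.str[-1]
last_via_rsplit = two_parts.str[-1]

print(all_parts.str.len().tolist())
print(two_parts.str.len().tolist())
print(last_via_rsplit.tolist())
```

Both approaches give the same labels for this data; rsplit simply avoids building pieces that are thrown away.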

Alternatively, you can use .str.extract():

In [177]: s.str.extract(r'.*_([^_]*)$', expand=False)
Out[177]:
0          Stem
1        Goblet
2    Enterocyte
3        Goblet
Name: 0, dtype: object
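
If you would rather keep the original idea of replacing the match with the first group, .str.replace() accepts a backreference in the replacement string. The following is a sketch, not a tested drop-in: it assumes the pattern from the question with the capture group (\w+) appended, written as a raw string, and it rebuilds the series by hand.

```python
import pandas as pd

# The series from the question, rebuilt by hand for this sketch
s = pd.Series([
    "SC_S193_M7.CONTROLDAY10.EPI.P1_Stem",
    "SC_S194_M7.CONTROLDAY10.EPI.P1_Goblet",
    "SC_S102_M1.CONTROLDAY3.EPI2_Enterocyte",
    "SC_S106_M1.CONTROLDAY3.EPI2_Goblet",
])

# The asker's pattern with the capture group (\w+) at the end.
# A raw string keeps the backslashes single.
pattern = r"SC_S\d{3}_M\d\.CONTROLDAY\d{1,2}\.EPI\d?(?:\.P\d_|_)(\w+)"

# Replace the whole match with group 1. regex=True is passed explicitly
# because the default for this argument differs between pandas versions.
replaced = s.str.replace(pattern, r"\1", regex=True)

# The same pattern also works with extract, which returns group 1 directly
extracted = s.str.extract(pattern, expand=False)

print(replaced.tolist())
print(extracted.tolist())
```

The attempt in the question passed only the pattern to .replace(); the replacement string is the missing second argument.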

Answer 2

Another variant that should work (assuming s is your series) is something like:

import re
s.apply(lambda r: re.sub(r'.*_([^_]*)$', r'\1', r))
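
The approaches in the two answers behave differently when a row contains no underscore at all. A small sketch of that edge case follows; the second row is a made-up value that is not in the original data, and the variable names are illustrative.

```python
import re
import pandas as pd

# First row is from the question; the second row is hypothetical,
# added only to show what happens when nothing matches
s = pd.Series(["SC_S106_M1.CONTROLDAY3.EPI2_Goblet", "NoUnderscoreHere"])

pattern = re.compile(r".*_([^_]*)$")

# re.sub leaves a string unchanged when the pattern does not match
via_sub = s.apply(lambda r: pattern.sub(r"\1", r))

# str.extract yields NaN for rows where the pattern does not match
via_extract = s.str.extract(r".*_([^_]*)$", expand=False)

# rsplit returns the whole string as the only piece when there is no '_'
via_rsplit = s.str.rsplit("_", n=1).str[-1]

print(via_sub.tolist())
print(via_extract.tolist())
print(via_rsplit.tolist())
```

Which behaviour is preferable depends on whether unmatched rows should be flagged as missing or passed through untouched.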

2 answers

solution1
4 ACCEPTED 2018-01-28 22:13:21

solution2
2 2018-01-28 22:32:28
