
Extracting last occurrence of a certain string from each row in a pandas series

From each row of a pandas Series, I want to extract the last occurrence of the word "User" together with the number that follows right after it. Everything else can be discarded. How would you do this? Thanks!

Here's an example of the series:

0                         1 - Unassigned, 2 - User 397335
1         1 - Unassigned, 2 - User 525767, 3 - Unassigned
2                                          1 - Unassigned
3                                          1 - Unassigned
4                                          1 - Unassigned
                               ...                       
163678                                     1 - Unassigned
163679    1 - Unassigned, 2 - User 347991, 3 - Unassigned
163680                                     1 - Unassigned
163681                                     1 - Unassigned
163682    1 - Unassigned, 2 - User 663455, 3 - Unassigned

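Use str.findall to collect every match in each row as a list, then take the last element of that list with .str[-1]. Rows with no match come back as NaN: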

>>> df['A'].str.findall(r'User \d+').str[-1]

0         User 397335
1         User 525767
2                 NaN
3                 NaN
4                 NaN
163678            NaN
163679    User 347991
163680            NaN
163681            NaN
163682    User 663455
Name: A, dtype: object
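
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch. The sample Series is made up to mirror the rows in the question (the last row with two users is a hypothetical addition, there to show that the final match wins). It also shows one possible alternative, str.extract with a greedy prefix, which is my own suggestion rather than part of the answer above.

```python
import pandas as pd

# Hypothetical sample data modelled on the rows shown in the question.
s = pd.Series(
    [
        "1 - Unassigned, 2 - User 397335",
        "1 - Unassigned, 2 - User 525767, 3 - Unassigned",
        "1 - Unassigned",
        "1 - User 111111, 2 - User 222222",  # two users: the last should win
    ],
    name="A",
)

# findall returns a list of all matches per row; .str[-1] keeps the last one.
# An empty list (no match) yields NaN.
last_user = s.str.findall(r"User \d+").str[-1]

# Alternative sketch: the greedy ".*" consumes as much of the row as it can,
# so the capture group ends up on the final "User <digits>" in the row.
last_user_extract = s.str.extract(r".*(User \d+)", expand=False)

print(last_user)
print(last_user_extract)
```

Both approaches should give the same result on data like this. The findall version is probably easier to read, while the extract version avoids building an intermediate list for every row.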
