
Replace second group of a regex match on a pandas dataframe

I have a dataframe of about 1000 rows, and I need to replace everything that appears after username: with a common string (say 'users').

I'm using the following regex, which fits my problem: it captures each username in the second group, and that group is what I want to replace with 'users'.

Regex:

"(?i)(\busername\b\s?|\buname\s?)+[;|:](\s?[a-z-A-Z0-9@:!+=#$%^&*-]{5,})"

Test data:

 username : user111
    uname : user212

Expected Output:

username : users
uname : users

I also want to run this on a large dataset, so I'm looking for an efficient way to do it.

I'm sure you could use regex for this, but sometimes the simplest thing to do is split and join, such as:

import pandas as pd

df = pd.DataFrame({'values':['username : user111','uname : user212']})

df['values'].apply(lambda x: ': '.join([x.split(':')[0], 'users']))

Or if you'd like to avoid lambda:

df['values'].str.split(':').str.get(0) + ': users'

Output

             values
0  username : users
1     uname : users
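
If you do want to stay with a regex, one possible approach is to capture the label and separator in a group and write it back with a backreference, so that only the username part is replaced. The sketch below is an assumption-laden simplification of the pattern in the question, not a drop-in equivalent: the names pattern and result are made up for illustration, the separator class is narrowed to ; or :, and the label is kept in group 1 instead of replacing "the second group" directly.

```python
import re

import pandas as pd

df = pd.DataFrame({'values': ['username : user111', 'uname : user212']})

# Group 1 keeps the label plus separator ("username :" or "uname :").
# The username itself (5+ allowed characters) is matched but not captured,
# so it is dropped and replaced by the literal text "users".
pattern = re.compile(
    r"\b((?:username|uname)\s?[;:])\s?[A-Za-z0-9@:!+=#$%^&*-]{5,}",
    flags=re.IGNORECASE,
)

# Series.str.replace accepts a compiled pattern when regex=True;
# \1 in the replacement refers to group 1 above.
result = df['values'].str.replace(pattern, r"\1 users", regex=True)

print(result.tolist())
```

Both versions work on the whole column at once. On roughly 1000 rows either should be fast enough; for a much larger dataset it is worth timing them on your own data before choosing.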

