Regex find and replace with groups

I'm trying to find the screen name in each line of text, e.g. screen_name: CoinLibre2009. After finding every screen name, I have to replace it with its original first character, then exactly four asterisks (****), and then its last character. For example, the screen name CoinLibre2009 would become C****9. I'm thinking that I need to use groups, and that I should include the "screen_name: " in my find and just put it back in with the replace.

Here are a few lines of the text I'm working with:

posted: Sat Feb 03 2018 11:03:09    text: Today we can see positive trends for growth, but will there be a new fall? crypto screen_name: Ksandimo   location: null  verified: false followers_count: 1597   friends_count: 17   lang: ru    retweet_count: 0    favorite_count: 0
posted: Sat Feb 03 2018 11:03:14    text: 8745.02$ per now  screen_name: CoinLibre2009  location: Free World    verified: false followers_count: 113    friends_count: 110  lang: ru    retweet_count: 0    favorite_count: 0
posted: Sat Feb 03 2018 11:03:16    text: Current price of is $8745.02  screen_name: bitcoinavg location: null  verified: false followers_count: 44 friends_count: 9    lang: en    retweet_count: 0    favorite_count: 0
posted: Sat Feb 03 2018 11:03:25    text: Think weve hit resistance for Bitcoin now. Will it fully recover? Im not sure screen_name: jasongaved location: Brighton & Hove / London  verified: false followers_count: 1996   friends_count: 1967 lang: en    retweet_count: 0    favorite_count: 0
posted: Sat Feb 03 2018 11:03:28    text: Today's price is $8745.02 as of February 3, 2018 at 11:59AM   screen_name: FR33Q  location: Europe    verified: false followers_count: 1164   friends_count: 1998 lang: en    retweet_count: 0    favorite_count: 0

Also, here is a screenshot of what the data looks like in Notepad++: [image]

I'm using regex in Notepad++ for this task. Here is what I have come up with so far: screen_name:\s[A-Za-z0-9]+ This is where I get stuck, as I'm not sure how to replace everything between the first and last characters.

You can use capturing groups in the regex pattern and replacement backreferences (also called placeholders) in the replacement pattern. Also, if you want to match letters, digits and underscores, use \w instead of a custom [a-zA-Z0-9_].

Use

(screen_name:\s\w)\w*(\w)

The (screen_name:\s\w) part captures screen_name:, a whitespace character and the first word character of the name into Group 1, referred to as $1 in the replacement pattern. \w* then matches 0 or more word characters, and (\w) matches and captures the last word character into Group 2, referred to as $2 in the replacement pattern. Note that the name must be at least two characters long for the pattern to match.

Replace with $1****$2.

See the regex demo.
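If you ever need to do the same replacement outside Notepad++, here is a minimal sketch in Python, assuming the standard re module. The function name mask_screen_names is just a hypothetical helper for illustration. The pattern is the same as above, but Python writes the backreferences as \g<1> and \g<2> rather than $1 and $2.

```python
import re

# Same pattern as above: Group 1 is "screen_name:", one whitespace and the
# first word character; Group 2 is the last word character of the name.
PATTERN = re.compile(r"(screen_name:\s\w)\w*(\w)")


def mask_screen_names(text: str) -> str:
    """Hypothetical helper: keep the first and last character of each
    screen name and put exactly four asterisks between them."""
    # \g<1> and \g<2> are Python's equivalents of $1 and $2 in Notepad++.
    return PATTERN.sub(r"\g<1>****\g<2>", text)


line = "text: 8745.02$ per now  screen_name: CoinLibre2009  location: Free World"
print(mask_screen_names(line))
# text: 8745.02$ per now  screen_name: C****9  location: Free World
```

As in Notepad++, a one-character screen name is left unchanged, because the pattern needs at least two word characters.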


(screen_name:\s[A-Za-z0-9_])[A-Za-z0-9_]*([A-Za-z0-9_]) => $1****$2
