Perl: find and replace part of the matching string (regex issue)

Suppose I have a huge XML file that contains a lot of information, including email addresses. All of the email addresses look something like the following:

user@gmail.com

The issue I'm running into involves regular expressions. How do I match on the email address but replace only the user portion? I tried using a look-ahead with no luck (it ends up replacing EVERYTHING before the @gmail.com). Is there a way to use a look-ahead but only go back as far as the whitespace before user? Or is there a simpler solution? Right now I have something like the following:

perl 's/(?=@gmail.com)/replacement$&/ge' file.xml

which obviously doesn't work. Any help is much appreciated!

Use a character class that matches everything except whitespace and @:

s/[^\s@]+(?=\@gmail\.com)/replacement/g
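
As a quick check of the character-class approach, here is a minimal sketch. The helper name `replace_gmail_user` and the sample input are made up for illustration. The `@` is written as `\@` so that Perl does not try to interpolate an array named `@gmail` into the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the user part of any @gmail.com address.
sub replace_gmail_user {
    my ($text, $replacement) = @_;
    # [^\s@]+           : run of non-whitespace, non-@ characters (the user part)
    # (?=\@gmail\.com)  : look-ahead, so the domain must follow but is not consumed
    $text =~ s/[^\s@]+(?=\@gmail\.com)/$replacement/g;
    return $text;
}

# Only the gmail address is touched; the other address is left alone.
print replace_gmail_user('contact: user@gmail.com, other@example.org', 'newuser'), "\n";
```

From the command line, the same substitution could be applied in place with something like `perl -pi.bak -e 's/[^\s@]+(?=\@gmail\.com)/replacement/g' file.xml`, which keeps the original as `file.xml.bak`.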

You could always just use the HTML5 email validator pattern to match the user name.
http://www.w3.org/TR/html5/forms.html#valid-e-mail-address

$string =~ s/[a-zA-Z0-9.!#\$%&'*+\/=?^_`{|}~-]+(@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)/replacement$1/g;

Expanded:

 [a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+ 
 (                                      # (1 start)
      @
      [a-zA-Z0-9] 
      (?:
           [a-zA-Z0-9-]{0,61} 
           [a-zA-Z0-9] 
      )?
      (?:
           \. 
           [a-zA-Z0-9] 
           (?:
                [a-zA-Z0-9-]{0,61} 
                [a-zA-Z0-9] 
           )?
      )*
 )                                      # (1 end)

s/ \S+(?=\@gmail\.com)/ replacement string/g;

I think this will solve your problem for this scenario:

<email>this is user@gmail.com</email>

This regex

s/[^>]+(?=\@gmail\.com)/replacement string/g

will handle this scenario:

<email>user@gmail.com</email>

And this

s/[^"]+(?=\@gmail\.com)/replacement string/g

will handle this:

<person email="user@gmail.com"></person>

So combined, excluding whitespace, > and " from the user part, we have

s/[^\s>"]+(?=\@gmail\.com)/replacement string/g
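
A single pattern can cover all three scenarios by excluding whitespace, `>` and `"` from the user part and leaving the domain in a look-ahead. The following is a minimal sketch under that assumption. The helper name `replace_user_in_xml` is hypothetical, and the sample strings are the three scenarios shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: swap the user part of @gmail.com addresses in XML text.
sub replace_user_in_xml {
    my ($xml, $replacement) = @_;
    # The user part may not contain whitespace, '>' or '"', so the match
    # cannot extend backwards into a tag, an attribute name, or earlier words.
    # The look-ahead keeps "@gmail.com" out of the replaced text.
    $xml =~ s/[^\s>"]+(?=\@gmail\.com)/$replacement/g;
    return $xml;
}

my @samples = (
    '<email>this is user@gmail.com</email>',
    '<email>user@gmail.com</email>',
    '<person email="user@gmail.com"></person>',
);

# Print each scenario after the substitution.
print replace_user_in_xml($_, 'replacement'), "\n" for @samples;
```

Using one negated character class instead of an alternation such as `\S+|[^>]+|[^"]+` avoids the first branch greedily swallowing the opening tag or attribute name along with the user name.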
