Suppose I have a huge XML file that contains a bunch of information, including email addresses. All of the email addresses look something like the following:
user@gmail.com
The issue I'm running into is with regular expressions. How do I match the email address but replace only the user portion? I tried using a look-ahead with no luck (it ends up replacing EVERYTHING before the @gmail.com). Is there a way to use a look-ahead but match only back to the whitespace before user? Or is there a simpler solution? Right now I have something like the following:
perl 's/(?=@gmail.com)/replacement$&/ge' file.xml
which obviously doesn't work. Any help is much appreciated!
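Use a character class that matches every character except whitespace and @ (the @ before gmail is escaped so that Perl does not interpolate it as an array named @gmail):
s/[^\s@]+(?=\@gmail\.com)/replacement/g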
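A minimal sketch of this substitution wrapped in a small script. The helper name `replace_user` and the sample string are only illustrative. The sketch writes `\@` inside the pattern so that Perl does not try to interpolate an array called `@gmail`:

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the user part of a gmail address.
# The look-ahead checks for "@gmail.com" without consuming it, so the
# domain is left untouched by the substitution.
sub replace_user {
    my ($text, $new_user) = @_;
    $text =~ s/[^\s\@]+(?=\@gmail\.com)/$new_user/g;
    return $text;
}

# Single quotes keep the "@" in the sample data literal.
print replace_user('<email>this is user@gmail.com</email>', 'replacement'), "\n";
```

From the command line, the same substitution could be run as `perl -pe 's/[^\s\@]+(?=\@gmail\.com)/replacement/g' file.xml`, which prints the result to standard output. Adding `-i.bak` would edit the file in place and keep a backup copy. Note that this pattern relies on whitespace appearing before the user name.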
You could always just use the HTML5 email validation pattern to match the user name.
http://www.w3.org/TR/html5/forms.html#valid-e-mail-address
$string =~ s/[a-zA-Z0-9.!#\$%&'*+\/=?^_`{|}~-]+(@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)/$1/g;
Expanded:
[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+
(                                 # (1 start)
     @
     [a-zA-Z0-9]
     (?:
          [a-zA-Z0-9-]{0,61}
          [a-zA-Z0-9]
     )?
     (?:
          \.
          [a-zA-Z0-9]
          (?:
               [a-zA-Z0-9-]{0,61}
               [a-zA-Z0-9]
          )?
     )*
)                                 # (1 end)
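As a rough sketch, the same pattern can be assembled from smaller `qr//` pieces and used to swap in a new user name while keeping the captured domain. The variable names and the helper `replace_local_part` are hypothetical. Unlike the gmail-specific patterns, this one touches addresses on any domain:

```perl
use strict;
use warnings;

# Pieces of the HTML5 valid-e-mail-address pattern.
# "$" and "@" are escaped so that Perl does not interpolate them.
my $user_re  = qr/[a-zA-Z0-9.!#\$%&'*+\/=?^_`{|}~-]+/;
my $label_re = qr/[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?/;
my $domain_re = qr/\@$label_re(?:\.$label_re)*/;

# Hypothetical helper: replace the user name, keep the domain (group 1).
sub replace_local_part {
    my ($text, $new_user) = @_;
    $text =~ s/$user_re($domain_re)/$new_user$1/g;
    return $text;
}

print replace_local_part('<email>user@gmail.com</email>', 'replacement'), "\n";
```

Because `<`, `>` and `"` are not in the user-name character class, the surrounding XML markup is not swallowed by the match.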
s/ (\S+)\@gmail\.com/replacement string/g;
I think this will solve your problem for this scenario:
<email>this is user@gmail.com</email>
This regex
s/([^>]+)\@gmail\.com/replacement string/g
will handle this scenario:
<email>user@gmail.com</email>
And this
s/([^"]+)\@gmail\.com/replacement string/g
will handle this one:
<person email="user@gmail.com"></person>
So, combined, we have:
s/(\S+|[^>]+|[^"]+)\@gmail\.com/replacement string/g
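An alternative sketch for covering all three scenarios at once, which is not the alternation above: a single negated character class that excludes whitespace, `>`, `"` and `@`, combined with a look-ahead so the domain is kept. The helper name `replace_gmail_user` and the sample strings are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: the user name is taken to be the run of characters
# directly before "@gmail.com" that contains no whitespace, ">", '"' or "@".
sub replace_gmail_user {
    my ($text, $new_user) = @_;
    $text =~ s/[^\s>"\@]+(?=\@gmail\.com)/$new_user/g;
    return $text;
}

for my $sample (
    '<email>this is user@gmail.com</email>',
    '<email>user@gmail.com</email>',
    '<person email="user@gmail.com"></person>',
) {
    print replace_gmail_user($sample, 'replacement'), "\n";
}
```

Excluding the delimiters in one class avoids the case where a branch such as `\S+` also consumes an opening tag that sits directly in front of the address.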