Regular expression replace
I need a regex script.
Example:
#!/usr/bin/env perl
use 5.10.0;
use strict;
use warnings;
my @samples = (
    "Mike&Ike"         => "MikeIke",
    "Mike-Ike"         => "Mike-Ike",
    "Mike-Ike-Jill"    => "Mike-Ike-Jill",
    "Mike--Ike-Jill"   => "Mike-Ike-Jill",
    "Mike--Ike---Jill" => "Mike-Ike-Jill",
    "Mike.Ike.Bill"    => "Mike.IkeBill",
    "Mike***Joe"       => "MikeJoe",
    "Mike123"          => "Mike123",
);
while (my($got, $want) = splice(@samples, 0, 2)) {
    my $had = $got;
    for ($got) {
        # 1) Allow max 1 dashy bit connected to each other.
        s/ ( \p{Dash} ) \p{Dash}+ /$1/xg;
        # 2) Allow max 1 period, total.
        1 while s/ ^ [^.]* \. [^.]* \K \. //x ;
        # 3) Remove all symbols and punctuation
        #    except for dashy bits and dots.
        s/ (?! [\p{Dash}.] ) [\p{Symbol}\p{Punctuation}] //xg ;
    }
    if ($got eq $want) { print "RIGHT" }
    else               { print "WRONG" }
    print ":\thad\t<$had>\n\twanted\t<$want>\n\tgot\t<$got>\n";
}
Generates:
RIGHT: had <Mike&Ike>
wanted <MikeIke>
got <MikeIke>
RIGHT: had <Mike-Ike>
wanted <Mike-Ike>
got <Mike-Ike>
RIGHT: had <Mike-Ike-Jill>
wanted <Mike-Ike-Jill>
got <Mike-Ike-Jill>
RIGHT: had <Mike--Ike-Jill>
wanted <Mike-Ike-Jill>
got <Mike-Ike-Jill>
RIGHT: had <Mike--Ike---Jill>
wanted <Mike-Ike-Jill>
got <Mike-Ike-Jill>
RIGHT: had <Mike.Ike.Bill>
wanted <Mike.IkeBill>
got <Mike.IkeBill>
RIGHT: had <Mike***Joe>
wanted <MikeJoe>
got <MikeJoe>
RIGHT: had <Mike123>
wanted <Mike123>
got <Mike123>
You could do this in several passes.
It's a fairly generic workaround that could be shortened by using lookbehind (not all regex flavors support this).
1) Collapse each run of multiple - into a single - (regex -{2,}).
2) Remove every symbol other than - and . (regex [^-\\.A-Za-z0-9]).
3) Replace the first . with a temporary character, e.g. !, then remove the remaining . characters.
4) Replace the ! from the previous step with . again.
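The passes above can be sketched in Perl, the language of the question. This is only a minimal sketch under a few assumptions: the sub name clean_multipass is hypothetical, and the character class is written as [^-.A-Za-z0-9] (inside a class, a doubled backslash would also exempt literal backslashes). Pass 2 has to run before pass 3, because it guarantees that no ! from the input is left to collide with the temporary marker.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# clean_multipass is a hypothetical name; each pass mirrors one listed step.
sub clean_multipass {
    my ($s) = @_;
    $s =~ s/-{2,}/-/g;           # 1) collapse runs of "-" into a single "-"
    $s =~ s/[^-.A-Za-z0-9]//g;   # 2) drop every symbol except "-" and "."
    $s =~ s/\./!/;               # 3) no /g: only the first "." becomes "!"
    $s =~ s/\.//g;               #    then remove all remaining "."
    $s =~ s/!/./;                # 4) restore the protected first "."
    return $s;
}

# Print the cleaned form of a few of the question's samples.
print clean_multipass($_), "\n"
    for "Mike.Ike.Bill", "Mike--Ike---Jill", "Mike***Joe";
```

The temporary character costs two extra substitutions, but it only needs plain character classes, so it should carry over to regex flavors without lookbehind.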
Update using C# .NET
(I'm not a C# programmer; I used this regex tester and this reference for the C# .NET regex flavor.)
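// Needs: using System.Text.RegularExpressions;
String str = "Mike&Ike ......";
str = Regex.Replace( str, @"-+", @"-" );                // collapse each run of - into one
str = Regex.Replace( str, @"(?<=\.)(.*?)\.", @"$1" );   // drop every . after the first
str = Regex.Replace( str, @"[^\w\r\n.-]", @"" );        // remove symbols, but keep . and -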
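1) Replace multiple - with a single -.
2) Remove a . if it is not the first ., using a positive lookbehind (?<=...).
3) Remove symbols with a negated character class built on \w and the newline characters. \w is short for [A-Za-z0-9_]; in .NET it also matches Unicode letters and digits unless RegexOptions.ECMAScript is set.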
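For comparison with the Perl in the question, the same three replacements can be sketched as a Perl port. The sub name clean_lookbehind is hypothetical. This sketch also assumes that . and - should be exempted in the final class, since the sample outputs keep them. The lookbehind is fixed-width, which Perl requires, and under /g it is checked against the original string, so every dot after the first is dropped in a single pass.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# clean_lookbehind is a hypothetical name for a Perl port of the three steps.
sub clean_lookbehind {
    my ($s) = @_;
    $s =~ s/-+/-/g;                # 1) collapse runs of "-" into a single "-"
    $s =~ s/(?<=\.)(.*?)\./$1/g;   # 2) keep the text after a ".", drop the next "."
    $s =~ s/[^\w\r\n.-]//g;        # 3) remove symbols; "." and "-" are exempted
                                   #    (assumption taken from the sample outputs)
    return $s;
}

# "_" survives the last pass because \w includes the underscore.
print clean_lookbehind($_), "\n"
    for "Mike.Ike.Bill", "a.b.c.d", "Mike_Ike&Joe";
```

Compared with the temporary-character approach, this needs one substitution for the dots instead of three, at the price of depending on lookbehind support.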