简体   繁体   中英

How to replace consecutive and identical characters in Perl?

I have a string like XXXXYYYYZZZYYZZZYYYY which needs to be converted to XXXXAAAYZZZAYZZZAAAY

$s =~ s/Y{2}+/AY/g;

this has 2 problems, {2}+ will get YYYY to AYAY; and AY is not the same length as YYYY (expecting AAAY )

How to get this done in perl?

Use a "look-ahead":

$s =~ s/Y(?=Y+)/A/g;

(?=Y+) means "followed by one or more Y characters", so any Y character that is followed by another Y character will be replaced with an A .

More info from perlretut

There's always more than one way to do it. My suggestion is to grab all the Ys except the last one, and then use that to create a string of As of the same length. The e modifier tells perl to execute the code in the replacement side instead of using it directly, and the r modifier tells =~ to return the result of the substitution instead of modifying the input text directly (useful for these one-liner tests, among other places).

$ perl -E 'say shift =~ s/(Y+)(?=Y)/"A"x length$1/gre' XXXXYYYYZZZYYZZZYYYY
XXXXAAAYZZZAYZZZAAAY

$s =~ s/Y{2}+/AY/g
RHS Pattern is ambiguously obscure pattern: Y{2}+ , that's very rarely used regex pattern except if {}+ very rarely is available in few advanced regex engine, including perl maybe, as a regex feature called 'atomic grouping'.
You might have meant (Y{2})+ which is (YY)+ or Y{2,} which is YY+
in perl it's no brainer simple and easy as it supports lookaround feature

perl -e '$s=XXXXYYYYZZZYYZZZYYYY ;$s =~ s/Y(?=Y)/A/g;print $s'

actually lower regex engine such sed still can do it albeit in cumbersome, uneasy way

echo XXXXYYYYZZZYYZZZYYYY |sed -E 's/YY+/&\n/g;s/Y/A/g;s/A\n/Y/g'

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM