in this string
ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<
I want to replace each substring between XY
and <
by either ONE or TWO depending on characters between previous brackets:
if XY
after (CN)
replace substring by ONE
if XY
after (CI)
replace substring by TWO
So the result should be:
ab<(CN)cdONE<(CI)efgTWO<(CN)zONE<(CI)efgTWO<
XY
and following characters should be replaced but not angle bracket <
.
This is for modifying HTML and arbitrary characters can occur between XY
and <
. I guess I need two regex for (CN) and (CI).
# This one replaces just all XY:
my $s = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
$s =~ s/(XY(.*?))</ONE/g;
# But how to add the conditions to the regex?
You don't need two regexes. Capture the C[NI]
and retrieve the corresponding replacement value from a hash:
#!/usr/bin/perl
use warnings;
use strict;
my $s = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
my %replace = (CN => 'ONE', CI => 'TWO');
$s =~ s/(\((C[NI])\).*?)XY.*?</$1$replace{$2}</g;
my $exp = 'ab<(CN)cdONE<(CI)efgTWO<(CN)zONE<(CI)efgTWO<';
use Test::More tests => 1;
is $s, $exp;
My guess is that this expression or maybe a modified version of that might work, not sure though:
([a-z]{2}<\([A-Z]{2}\)[a-z]{2})([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)(<\([A-Z]{2}\)[a-z])([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)<
use strict;
use warnings;
my $str = 'ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
my $regex = qr/([a-z]{2}<\([A-Z]{2}\)[a-z]{2})([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)(<\([A-Z]{2}\)[a-z])([^<]+)(<\([A-Z]{2}\)[a-z]{3})([^<]+)</mp;
my $subst = '"$1ONE$3TWO$5ONE$7TWO<"';
my $result = $str =~ s/$regex/$subst/rgee;
print $result;
The expression is explained on the top right panel of this demo , if you wish to explore/simplify/modify it, and in this link , you can watch how it would match against some sample inputs step by step, if you like.
This can be done in one line regex using /e
and ternary operator ?
in the /replace/
. /r
option returns the resulting string, in effect this would keep the original string $s
unmodified.
use strict;
use warnings;
my $s ='ab<(CN)cdXYlm<(CI)efgXYop<(CN)zXYklmn<(CI)efgXYuvw<';
print (($s=~s/\(([^)]+)\)([^(]+)XY[^(]+</"($1)$2".(($1 eq CN)?ONE:TWO)."<"/gre)."\n");
ab<(CN)cdONE<(CI)efgTWO<(CN)zONE<(CI)efgTWO<
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.