I have the following Perl code. I am trying to extract the path from each element of the array @links, append "\\" or "/" to the end, and push the result onto a new array, but I am not getting the desired output. What am I missing?
use strict;
my @links = (
"incl -s projectA /. /abc/cde/efg",
"incl -s projectA \. \hij\klm\nop",
);
my ( $path, $link, @linkpaths, $op );
my $substr = "/";
foreach $link (@links) {
$link =~ m{incl -s projectA /. /|\\.\\(.+)};
$path = $1;
print "Path is $path \n";
if ( index( $path, $substr ) != -1 ) {
print "$link contains $substr\n";
$op = "/";
} else {
print "$link doesnt contains $substr\n";
$op = "\\";
}
push @linkpaths, $path . $op;
}
print "\nlinkpaths:\n";
foreach (@linkpaths) {
print "$_\n";
}
Desired Output:
Path is abc/cde/efg
abc/cde/efg contains /
Path is \hij\klm\nop
hij\klm\nop doesnt contain /
linkpaths:
abc/cde/efg/
hij\klm\nop\
The problem is that the special characters in your strings (both simple strings and regular expressions) are not escaped, and you have no use warnings at the top of your program, which would have alerted you to this.
For instance, if I add use warnings and use Data::Dump to display your @links array, I get this:
Unrecognized escape \h passed through at E:\Perl\source\dd.pl line 8.
Unrecognized escape \k passed through at E:\Perl\source\dd.pl line 8.
[
"incl -s projectA /. /abc/cde/efg",
"incl -s projectA . hijklm\nop",
]
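So the backslashes in the second element have vanished: \., \h and \k each lost their backslash, and \n was turned into a newline character (which Data::Dump displays as \n).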
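A minimal sketch of the quoting difference, using only core Perl. The variable names $double and $single are mine, not from the question. It shows that a double-quoted string needs each backslash doubled, while a single-quoted string keeps them as typed.

```perl
use strict;
use warnings;

# Double quotes process escapes, so every literal backslash must be doubled.
my $double = "\\hij\\klm\\nop";

# Single quotes leave backslashes alone (only \\ and \' are special there).
my $single = '\hij\klm\nop';

print "double: $double\n";
print "single: $single\n";
print "The two strings are identical\n" if $double eq $single;
```

Single quotes are usually the simpler choice for data like this, because nothing in the string needs to be interpolated.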
Now the regex looks fine on the face of it, but I hope it is clear that your alternation extends to the full length of the pattern, so
m{incl -s projectA /. /|\\.\\(.+)}
matches either
incl -s projectA /. /
or
\\.\\(.+)
which isn't at all what you had in mind. You also need to escape the dots, which otherwise match any character other than a newline; and you have dropped a space, so you currently have either /. / (with an intermediate space) or \\.\\ (without one).
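The effect of that top-level alternation can be checked with a short sketch. It assumes the input strings are single-quoted so that the backslashes survive, and the @report array is my own addition for collecting the results.

```perl
use strict;
use warnings;

# The pattern from the question, unchanged.
my $re = qr{incl -s projectA /. /|\\.\\(.+)};

# Single-quoted, so the backslashes in the second element are preserved.
my @links = (
    'incl -s projectA /. /abc/cde/efg',
    'incl -s projectA \. \hij\klm\nop',
);

my @report;
for my $link (@links) {
    if ($link =~ $re) {
        # The capture group belongs to the second alternative only.
        push @report, defined $1 ? "matched, captured '$1'" : 'matched, $1 undefined';
    }
    else {
        push @report, 'no match';
    }
}
print "$_\n" for @report;
```

The first string matches through the left-hand alternative, which contains no capture group, so $1 stays undefined. The second string does not match at all, because the right-hand alternative has no space between the dot and the second backslash.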
It's a little trickier to fix than you might hope because (I think) you want to capture everything after projectA, but also allow for either forward or backward slashes. That would become
m{incl -s projectA ((?:/\. /|\\\. \\).+)}
which, employing the /x modifier and replacing literal spaces with \s+, I hope you'll agree can be more clearly written
m{ incl \s+ -s \s+ projectA \s+ ( (?: /\. \s+ / | \\\. \s+ \\ ) .+ ) }x
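As a quick sanity check, the sketch below runs both spellings of the pattern against the two sample strings and compares what they capture. The names $compact, $spaced, @samples and $agree are mine. Note that the /x version is slightly more permissive, since \s+ also accepts runs of whitespace.

```perl
use strict;
use warnings;

# Compact form with literal spaces.
my $compact = qr{incl -s projectA ((?:/\. /|\\\. \\).+)};

# Same pattern under /x, with \s+ in place of each literal space.
my $spaced = qr{ incl \s+ -s \s+ projectA \s+ ( (?: /\. \s+ / | \\\. \s+ \\ ) .+ ) }x;

my @samples = (
    'incl -s projectA /. /abc/cde/efg',
    'incl -s projectA \. \hij\klm\nop',
);

my $agree = 1;
for my $sample (@samples) {
    my ($first)  = $sample =~ $compact;
    my ($second) = $sample =~ $spaced;
    # Both patterns must match and capture the same text.
    $agree = 0 unless defined $first && defined $second && $first eq $second;
}
print $agree ? "Both patterns capture the same text\n" : "Patterns disagree\n";
```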
Here's a fixed version of your code that includes all of the changes I have described.
use strict;
use warnings;
my @links = (
'incl -s projectA /. /abc/cde/efg',
'incl -s projectA \. \hij\klm\nop',
);
my ($path, $link, @linkpaths, $op);
my $substr = "/";
for my $link (@links) {
$link =~ m{incl \s+ -s \s+ projectA \s+ ( (?: /\. \s+ / | \\\. \s+ \\) .+ )}x;
$path = $1;
print "Path is $path \n";
if (index($path, $substr) >= 0) {
print "$link contains $substr\n";
$op = "/";
}
else {
print "$link doesn't contain $substr\n";
$op = "\\";
}
push @linkpaths, "$path$op";
}
print "\n";
print "linkpaths:\n";
print "$_\n" for @linkpaths;
output
Path is /. /abc/cde/efg
incl -s projectA /. /abc/cde/efg contains /
Path is \. \hij\klm\nop
incl -s projectA \. \hij\klm\nop doesn't contain /
linkpaths:
/. /abc/cde/efg/
\. \hij\klm\nop\
Update
To capture only the last path in each element of the input list that starts with a slash or backslash, I would replace the end of the pattern with (?: /\. \s+ | \\\. \s+ ) (.+) instead. But I believe it's far tidier to use a character class to represent either a forward or a backward slash, like [/\\].
This is another change to your complete program
use strict;
use warnings;
my @links =(
'incl -s projectA /. /abc/cde/efg',
'incl -s projectA \. \hij\klm\nop',
);
my @linkpaths;
my $substr = '/';
for (@links) {
next unless my ($path) = m{ incl \s+ -s \s+ projectA \s+ [/\\]\. \s+ ([/\\].+) }x;
print "Path is $path\n";
my $op;
if ($path =~ /\Q$substr/) {
printf "%s contains %s\n", $_, $substr;
$op = '/';
}
else {
printf "%s doesn't contain %s\n", $_, $substr;
$op = '\\';
}
push @linkpaths, "$path$op";
}
print "\n";
print "linkpaths:\n";
print "$_\n" for @linkpaths;
output
Path is /abc/cde/efg
incl -s projectA /. /abc/cde/efg contains /
Path is \hij\klm\nop
incl -s projectA \. \hij\klm\nop doesn't contain /
linkpaths:
/abc/cde/efg/
\hij\klm\nop\
You probably want a regex like this:
# m{incl[ ]-s[ ]projectA(?|[ ]/\.[ ](/)|[ ]\\\.[ ](\\))((?:(?!\1$).)+)$}g
incl [ ] -s [ ] projectA
(?|
[ ] /\. [ ]
( / ) # (1)
| [ ] \\\. [ ]
( \\ ) # (1)
)
( # (2 start)
(?:
(?! \1 $ )
.
)+
) # (2 end)
$
Sample:
use strict;
use warnings;
my @links =(
'incl -s projectA /. /abc/cde/efg',
'incl -s projectA \. \hij\klm\nop'
);
my ($path,$link,@linkpaths,$op);
my $substr="/";
for (@links) {
if ( m{incl[ ]-s[ ]projectA(?|[ ]/\.[ ](/)|[ ]\\\.[ ](\\))((?:(?!\1$).)+)$}g )
{
($op, $path) = ($1,$2);
print "Path is $path \n";
if ($op eq '/' ) {
print "$path contains /\n";
}
else {
print "$path doesnt contain /\n";
}
push @linkpaths, $path . $op;
}
}
print "\nlinkpaths:\n";
for (@linkpaths) {
print "$_\n";
}
Output:
Path is abc/cde/efg
abc/cde/efg contains /
Path is hij\klm\nop
hij\klm\nop doesnt contain /
linkpaths:
abc/cde/efg/
hij\klm\nop\