Perl: assignment within scalar and string matching (regex)

I understand the general aim of the following piece of code (i.e. summing the numeric parts of the string; e.g. for currstr="3S47M", seqlength=50).

But could someone explain to me what is happening, line by line?

In particular, I have trouble understanding what value $where holds on each iteration. More precisely, I don't understand the part with the scalar function: scalar($RLENGTH = length($&), $RSTART = length($`)+1).

Is it correct that the assignments to RLENGTH and RSTART take place inside scalar?

Why use comma-separated assignments within scalar? What does it mean, and what is the result of evaluating it?

If anybody could help, I would be very grateful!

Thanks

Erica

  my $seqlength=0; 
  my $currstr="3S47M";

  my $where = $currstr =~ /[0-9]+[M|D|N|X|=|S|H|N]/
    ? scalar($RLENGTH = length($&), $RSTART = length($`)+1) : 0;
  while ($where > 0) {
    $seqlength += substr($currstr, ($where)-1, $RLENGTH - 1) + 0;
    $currstr = substr($currstr, ($where + $RLENGTH)-1);
    $where = $currstr =~ /[0-9]+[M|D|N|X|=|S|H|N]/
      ? scalar($RLENGTH = length($&), $RSTART = length($`)+1) : 0;
  }

Edit: what is the purpose of RSTART? Why would writing just scalar($RLENGTH = length($&)) not work?

$where = $currstr =~ /[0-9]+[M|D|N|X|=|S|H|N]/
  ? scalar($RLENGTH = length($&), $RSTART = length($`)+1) : 0;

is equivalent to

if ($currstr =~ /[0-9]+[M|D|N|X|=|S|H|N]/) {
   $where = scalar($RLENGTH = length($&), $RSTART = length($`)+1);
} else {
   $where =  0;
}

scalar is useless here. The expression is already in scalar context, so simple parens would do.

When EXPRX, EXPRY is evaluated in scalar context, EXPRX and EXPRY are evaluated in turn, and the whole expression yields the value of EXPRY. As such, the above is equivalent to

if ($currstr =~ /[0-9]+[M|D|N|X|=|S|H|N]/) {
   $RLENGTH = length($&);
   $RSTART = length($`) + 1;
   $where = $RSTART;
} else {
   $where =  0;
}
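
To check the comma behaviour in isolation, here is a minimal sketch. The helper name match_info and the input "ab3S47M" are made up for illustration, and the character class is the cleaned-up [DHMNSX=] rather than the original one.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): returns the value
# of the comma expression, followed by the two variables it assigns.
sub match_info {
    my ($str) = @_;
    my ($RLENGTH, $RSTART);
    # Plain parens instead of scalar(): the assignment to a scalar
    # already imposes scalar context on both branches of ?: .
    my $where = $str =~ /[0-9]+[DHMNSX=]/
        ? ($RLENGTH = length($&), $RSTART = length($`) + 1)
        : 0;
    return ($where, $RSTART, $RLENGTH);
}

# "ab" precedes the match "3S", so $` has length 2 and $& has length 2.
printf "where=%d RSTART=%d RLENGTH=%d\n", match_info("ab3S47M");
# prints: where=3 RSTART=3 RLENGTH=2
```

Both assignments run, but only the last one (RSTART, the 1-based position of the match) ends up in $where.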

Note that [M|D|N|X|=|S|H|N] is a weird way of writing [MDX=SHN|]. Inside a character class, | is a literal character rather than alternation, and the repeated N and | have no effect. In fact, | is probably not supposed to be there at all. I suspect the class is supposed to be [DHMNSX=].
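
A quick way to see that | is matched literally by the original class (a small sketch; the helper name in_original_class is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the single character $ch is matched by the
# character class used in the original code, 0 otherwise.
sub in_original_class {
    my ($ch) = @_;
    return $ch =~ /\A[M|D|N|X|=|S|H|N]\z/ ? 1 : 0;
}

for my $ch ('M', '|', 'Z') {
    printf "%s => %d\n", $ch, in_original_class($ch);
}
# prints:
# M => 1
# | => 1
# Z => 0
```

So with the original class, an input such as "3|" would also be counted as a match.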


If I understand correctly, the code could have been written as follows:

my $currstr = "3S47M";

my $seqlength = 0; 
while ($currstr =~ /([0-9]+)[DHMNSX=]/g) {
   $seqlength += $1;
}

The following might even be sufficient (though not equivalent):

my $currstr = "3S47M";

my $seqlength = 0; 
while ($currstr =~ /[0-9]+/g) {
   $seqlength += $&;
}
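
To illustrate where the two loops differ, here is a sketch that wraps each one in a sub (the names sum_with_letter and sum_all_numbers, and the input "3S47M5I", are made up for illustration). They agree whenever every number is followed by a letter from the class and disagree otherwise.

```perl
use strict;
use warnings;

# Sums only the numbers that are followed by one of D H M N S X = .
sub sum_with_letter {
    my ($str) = @_;
    my $total = 0;
    while ($str =~ /([0-9]+)[DHMNSX=]/g) {
        $total += $1;
    }
    return $total;
}

# Sums every run of digits, whatever follows it.
sub sum_all_numbers {
    my ($str) = @_;
    my $total = 0;
    while ($str =~ /([0-9]+)/g) {
        $total += $1;
    }
    return $total;
}

print sum_with_letter("3S47M"), " ", sum_all_numbers("3S47M"), "\n";
# prints: 50 50
print sum_with_letter("3S47M5I"), " ", sum_all_numbers("3S47M5I"), "\n";
# prints: 50 55   (the 5 is followed by I, which is not in the class)
```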
