
Extract and assign a matched regex pattern from a string in Perl to a variable

Okay, I tried some options but I am not getting it right. It looks like a problem with my regex, but it might be something else in the syntax as well. Any help or direction is very much appreciated.

I am trying to read a CSV file and process it one line at a time, discarding the header line. I am interested in two fields in particular.

After reading each line, I process the two fields like this:

while ( my $line = <$data> ) {
    chomp $line;
    if ( $line !~ /^Date/ ) {
        if ( $line =~ /"/ ) { $line =~ s|"||g }

        ...;

        my $homeTeam = getTeam( $fields[5] );
        my $awayTeam = getTeam( $fields[7] );

        ...;

        my $arbiterRec = join ",", $gameDate, $gameTime, "", $season, $gameLevel,
            $homeTeam, "", $awayTeam, "", $site, $subSite, "", "";
        print "$arbiterRec\n";
    }
}

sub getTeam {
    my ($team) = trim( $_[0] ) =~ m{(R\d+-\d+B|G\d+$)}x;
    return $team;
}

sub trim {
    ( my $s = $_[0] ) =~ s/^\s+|\s+$//g;
    return $s;
}

With this, if I have an input like (fields of interest marked with ^^^):

mm/dd/yyyy, hh:mm AA, dd, Aaaaaa, aaD, R35-14G1, , U14 Girls Area Schedule R256-14G1, , AAA, , , 
                                       ^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I expect to get an output like:

mm/dd/yy, hh:mm AA, dd, Aaaaaa, aaD, R35-14G1, , R256-14G1, , AAA, , , 
                                     ^^^^^^^^    ^^^^^^^^^

Instead, what I am getting is:

mm/dd/yy, hh:mm AA, dd, Aaaaaa, aaD, G1, , G1, , AAA, , , 
                                     ^^    ^^

Any idea what I might be doing wrong in the syntax or RegEx match?

Just change your regex to:

(R\d+-\d+(?:B|G)\d+$)

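The actual problem is that alternation has the lowest precedence in a regex, so (R\d+-\d+B|G\d+$) means "either R\d+-\d+B or G\d+$". The engine first tries the left branch: R, one or more digits, a -, one or more digits, and finally B. Your input contains no text like that, so this branch fails. It then tries the right branch, G followed by one or more digits at the end of the string, which matches the trailing G1. Wrapping only the letter in a group, (?:B|G), limits the alternation to that single character, so the whole team code is captured.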
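To make the difference concrete, here is a minimal sketch that runs both patterns against the two sample fields from the question. The helper name first_capture is hypothetical and only used for illustration, and [BG] is used here as an equivalent spelling of (?:B|G). It assumes Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Original pattern: the | splits the whole group into two branches,
# "R\d+-\d+B" or "G\d+$".
my $original = qr{(R\d+-\d+B|G\d+$)}x;

# Adjusted pattern: the choice is confined to a single letter.
# [BG] is an equivalent spelling of (?:B|G).
my $fixed = qr{(R\d+-\d+[BG]\d+$)}x;

# Hypothetical helper: returns the first capture group, or undef if
# the pattern does not match.
sub first_capture {
    my ( $re, $text ) = @_;
    my ($hit) = $text =~ $re;    # list context yields the captures
    return $hit;
}

for my $field ( 'R35-14G1', 'U14 Girls Area Schedule R256-14G1' ) {
    my $old = first_capture( $original, $field ) // '(no match)';
    my $new = first_capture( $fixed,    $field ) // '(no match)';
    print "$field: original => $old, fixed => $new\n";
}
```

For both fields the original pattern captures only G1, while the adjusted pattern captures R35-14G1 and R256-14G1 respectively.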
