I need a way to count the number of matches in a regex capture group using either Perl or Bash. I can do this in PowerShell, but not in either of these languages. You have helped me get my regex working, but every example I see just prints the capture groups. Printing the match results doesn't help me; I need to count the number of matches in each group.
Here is example data for the regex (this is the output of a command, so it is neither static data nor read from a file):
JobID Type State Status Policy Schedule Client Dest Media Svr Active PID
41735 Backup Done 0 Policy_name_here daily hostname001 MediaSvr1 8100
41734 Backup Done 0 Policy_name_here daily hostname002 MediaSvr1 7803
41733 Backup Done 0 Policy_name_here daily hostname004 MediaSvr1 7785
41732 Backup Done 0 Policy_name_here daily hostname005 MediaSvr1 27697
41731 Backup Done 0 Folicy_name_here daily hostname006 MediaSvr1 27523
41730 Backup Done 0 Policy_name_here daily hostname007 MediaSvr1 27834
41729 Backup Done 0 Policy_name_here - hostname008 MediaSvr1 27681
41728 Backup Done 0 Policy_name_here - hostname009 MediaSvr1 27496
41727 Catalog Backup Done 0 catalog full hostname010 MediaSvr1 27347
41712 Catalog Backup Done 0 catalog - hostname004 30564
I can't use named capture groups because I am using Perl 5.8.5.
My regex:
/(\d+)?\s+((\b[^\d\W]+\b)|(\b[^\d\W]+\b\s+\b[^\d\W]+\b))?\s+((Done)|(Active)|(\w+\w+\-\w\-+))?\s+(\d+)?\s+((\w+)|(\w+\_\w+)|(\w+\_\w+\_\w+))?\s+((b[^\d\W]+\b\-\b[^\d\W]+\b)|(\-)|(\b[^\d\W]+\b))?\s+((\w+\.\w+\.\w+)|(\w+))?\s+((\w+\.\w+\.\w+)|(\w+))?\s+(\d+)?/g
Each capture group corresponds to a column, and I need to pull the results of each capture group into a variable so I can count them with something like where {$var -eq '0'}.count. Assuming Status -eq '0' denotes a successful backup, I need to count the number of successful backups in the Status capture group.
The final output should be something like:
Statistic.SUCCESSFUL: 20
I've already accomplished this using PowerShell, but Perl is completely different and Bash seems limited. If anyone knows how to do this in either of these languages, I'd appreciate some help.
Kind Regards,
DJ
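<>;  # Skip header
my $successes = 0;
while (<>) {
    chomp;
    # In list context, the match returns the captured groups.
    my @row = /.../
        or do {
            # Report the bad line and move on; use die() here instead to abort.
            warn("Line $. doesn't match pattern\n");
            next;
        };
    ++$successes if $row[3] eq '0';  # Fourth capture: Status
}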
You could also name the columns.
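<>;  # Skip header
my $successes = 0;
while (<>) {
    chomp;
    my %row;
    # A hash slice assigns each capture to its column name.
    @row{qw( JobID Type State Status ... )} = /.../
        or do {
            # Report the bad line and move on; use die() here instead to abort.
            warn("Line $. doesn't match pattern\n");
            next;
        };
    ++$successes if $row{Status} eq '0';
}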
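To make the hash-slice idea concrete, here is a minimal sketch that runs on the sample rows from the question. The pattern is a hypothetical, simplified one that only captures the first four columns (it is not the elided pattern above), and `count_successes` is a name made up for this illustration. It sticks to features available in Perl 5.8.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern covering only the first four columns. The optional
# second word in Type handles "Catalog Backup"; the State alternatives
# (Done, Active) are taken from the question's regex.
my $pattern = qr/^\s*(\d+)\s+(\S+(?:\s\S+)?)\s+(Done|Active)\s+(\d+)\s/;

# Count the lines whose Status capture is '0'.
sub count_successes {
    my @lines = @_;
    my $successes = 0;
    for my $line (@lines) {
        my %row;
        # In list context the match returns its captures, which the hash
        # slice assigns to the named columns. A failed match returns an
        # empty list, so the assignment is false and the line is skipped.
        (@row{qw( JobID Type State Status )} = $line =~ $pattern)
            or next;
        ++$successes if $row{Status} eq '0';
    }
    return $successes;
}

# Sample lines copied from the question (whitespace as posted).
my @sample = (
    'JobID Type State Status Policy Schedule Client Dest Media Svr Active PID',
    '41735 Backup Done 0 Policy_name_here daily hostname001 MediaSvr1 8100',
    '41727 Catalog Backup Done 0 catalog full hostname010 MediaSvr1 27347',
    '41712 Catalog Backup Done 0 catalog - hostname004 30564',
);

print "Statistic.SUCCESSFUL: ", count_successes(@sample), "\n";
```

The header line is skipped here simply because it does not start with digits, so it never matches.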
If you want to store the data in a data structure for later analysis, that's possible too.
<>;  # Skip header
my @rows;
while (<>) {
    chomp;
    my %row;
    @row{qw( JobID Type State Status ... )} = /.../
        or do {
            # Report the bad line and move on; use die() here instead to abort.
            warn("Line $. doesn't match pattern\n");
            next;
        };
    push @rows, \%row;
}
# grep in scalar context returns the number of matching rows.
my $successes = grep { $_->{Status} eq '0' } @rows;
Finally, that regex pattern is... awful. I'd split the columns by position with unpack instead, something like this:
# Note: s///r needs Perl 5.14+; the (_) prototype and possessive ++ need 5.10+.
sub trim(_) { $_[0] =~ s/^\s++|\s++\z//rg }
my $pattern;
my @headers;
{
    my $header_line = <>;
    chomp($header_line);
    # Join the multi-word headers so each one is a single token.
    $header_line =~ s/\bDest Media Svr\b/Dest_Media_Svr/;
    $header_line =~ s/\bActive PID\b/Active_PID/;
    # Build an unpack template with one "A<width>" field per header.
    $pattern = join '', map { "A".length($_) } $header_line =~ /\s*\S+/g;
    @headers = map trim, unpack $pattern, $header_line;
}
my @rows;
while (<>) {
    chomp;
    my %row; @row{@headers} = map trim, unpack $pattern, $_;
    push @rows, \%row;
}
my $successes = grep { $_->{Status} eq '0' } @rows;
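The question mentions Perl 5.8.5, while the `/r` substitution flag (Perl 5.14) and the `(_)` prototype and possessive quantifiers (Perl 5.10) are newer. Below is a minimal sketch of the same unpack approach that avoids those features. It assumes the report is right-aligned and fixed-width. The column widths in `$fmt` are invented for the demonstration, since the real spacing is not visible in the question, and `parse_report` is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Trim without s///r, the (_) prototype, or possessive quantifiers,
# so it also runs on Perl 5.8.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+\z//;
    return $s;
}

# Parse a right-aligned fixed-width report (header line first) into hash refs.
sub parse_report {
    my ($header_line, @data_lines) = @_;

    # Join the multi-word headers so each header is a single token.
    $header_line =~ s/\bDest Media Svr\b/Dest_Media_Svr/;
    $header_line =~ s/\bActive PID\b/Active_PID/;

    # Each column spans its leading padding plus the header word.
    my $template = join '', map { 'A' . length($_) } $header_line =~ /\s*\S+/g;
    my @headers  = map { trim($_) } unpack $template, $header_line;

    my @rows;
    for my $line (@data_lines) {
        my %row;
        @row{@headers} = map { trim($_) } unpack $template, $line;
        push @rows, \%row;
    }
    return @rows;
}

# Hypothetical layout: these column widths are assumed, not real command output.
my $fmt = '%6s %15s %6s %6s %17s %9s %12s %15s %11s';
my @report = (
    sprintf($fmt, 'JobID', 'Type', 'State', 'Status', 'Policy', 'Schedule',
                  'Client', 'Dest Media Svr', 'Active PID'),
    sprintf($fmt, 41735, 'Backup', 'Done', 0, 'Policy_name_here', 'daily',
                  'hostname001', 'MediaSvr1', 8100),
    sprintf($fmt, 41727, 'Catalog Backup', 'Done', 0, 'catalog', 'full',
                  'hostname010', 'MediaSvr1', 27347),
    sprintf($fmt, 41712, 'Catalog Backup', 'Done', 0, 'catalog', '-',
                  'hostname004', '', 30564),
);

my @rows      = parse_report(@report);
my $successes = grep { $_->{Status} eq '0' } @rows;
print "Statistic.SUCCESSFUL: $successes\n";
```

Because the fields are cut by position, a two-word Type such as "Catalog Backup" and an empty Dest Media Svr column need no special handling, which is the main advantage over splitting on whitespace.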