I have a test file that looks like that:
t # 3-0, 1
v 0 0
v 1 19
v 2 2
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-1, 1
v 0 0
v 1 15
v 2 2
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-2, 1
v 0 0
v 1 17
v 2 2
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-3, 1
v 0 0
v 1 18
v 2 7
u 0 1 2
u 0 2 2
u 1 2 2
I wrote the following code to match the last three lines of the transaction (each transaction starts with t #
)
#!/usr/bin/perl -w
use strict;
my $input = shift @ARGV or die $!;
open (FILE, "$input") or die $!;
LOOP: while (<FILE>) {
if (m/^(t\h*#\h*[0-9,\h-]+)/) {
my $transaction_id = $1;
while (<FILE>) {
if (m/^(u\h+[0]\h+[1]\h+[2])/) {
my $edge_1 = $1;
while (<FILE>) {
if (m/^(u\h+[0]\h+[2]\h+[2])/) {
my $edge_2 = $1;
while (<FILE>) {
if (m/^(u\h+[1]\h+[2]\h+[2])/) {
my $edge_3 = $1;
print $transaction_id . "\t" . $edge_1 . "\t" . $edge_2 . "\t" . $edge_3 . "\n";
next LOOP;
}
}
}
}
}
}
}
}
close FILE;
However, it does not print any results. When I compile my program, it runs without errors. My ultimate goal is to produce output like this, where I output edges of subgraphs "u 0 1 2", "u 0 2 2" and "u 1 2 2":
t # 3-0, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-1, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-2, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-3, 1 u 0 1 2 u 0 2 2 u 1 2 2
One way: Keep all lines for a transaction in a buffer, and when you get to a new transaction id store the previous one, along with the last three lines from that buffer
use warnings;
use strict;
use feature 'say';
my (@transactions, @trans_lines, $tid);
while (<>) {
chomp;
if (/^(t\s*#\s*[0-9,\s-]+)/) {
if (not $tid) {
$tid = $1; # the very first one starts
next;
}
# Store previous id and its last three lines, reset
push @transactions, [ $tid, @trans_lines[-3..-1] ];
$tid = $1;
@trans_lines = ()
}
push @trans_lines, $_;
}
say "@$_" for @transactions;
This stores all transactions in an array, so they are easily iterated and maintain the order from the file. This supports the use of results demonstrated in the question. But with an array one can't easily refer to a particular one, and if it is of interest to be able to look up particular id's consider using a hash of array references instead, like in the related problem .
The above code relies on there always being three lines in a transaction, as implicit in the question. I'd recommend adding a check.
The construct while (<>)
reads lines of all files given on the command line, or STDIN
.
Some comments on the posted code
The use warnings;
is better than using -w
switch
The $!
variable holds the error string. While it should indeed be used ubiquitously, if @ARGV
is empty the shift
returns an undef
and there is no error; so $!
is not set. Instead, do something like
my $file = shift @ARGV // die "Usage: $0 file\n";
or, better yet, invoke your routine with a fuller usage message, etc.
Use lexical filehandles , open my $fh, '<', $file or die $;;
, as they are plainly better in multiple ways than globs ( FH
)
There is no need to double-quote a lone scalar variable, as it will get evaluated anyway (while excessive quoting can even lead to subtle problems in some situations)
Nesting loops that read from the same resource (filehandle here) is legitimate and has its uses, but it adds a layer of complexity and makes the code harder to track. I'd use it very, very sparingly. Multiple levels of nesting add that much more complexity.
I don't readily see why the code in the questions doesn't work. Add printing statements?
Your code as it is gives me this output:
t # 3-0, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-1, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-2, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-3, 1 u 0 1 2 u 0 2 2 u 1 2 2
So it seems the problem is with something that you haven't shown us. Perhaps the input file comes from a different system and has line endings that your system doesn't recognise.
Your nested while
loops and if
conditions make the code more complex than it needs to be (and, therefore, harder to maintain). You can do it all in one loop using something like this:
#!/usr/bin/perl
use strict;
use warnings;
my $input = shift @ARGV or die $!;
open (my $fh, '<', $input) or die $!;
my ($transaction_id, $edge_1, $edge_2, $edge_3);
while (<$fh>) {
if (m/^(t\h*#\h*[0-9,\h-]+)/) {
$transaction_id = $1;
} elsif (m/^(u\h+[0]\h+[1]\h+[2])/) {
$edge_1 = $1;
} elsif (m/^(u\h+[0]\h+[2]\h+[2])/) {
$edge_2 = $1;
} elsif (m/^(u\h+[1]\h+[2]\h+[2])/) {
$edge_3 = $1;
}
if ($transaction_id and $edge_1 and $edge_2 and $edge_3) {
print "$transaction_id\t$edge_1\t$edge_2\t$edge_3\n";
($transaction_id, $edge_1, $edge_2, $edge_3) = (undef) x 4;
}
}
(Note, I've also replaced -w
with use warnings
and switched to using a lexical filehandle and the three-arg version of open()
. All of these are Modern Perl best practices.)
Would you please try the following:
#!/usr/bin/perl -w
my $ref;
open(FH, shift) or die;
while (<FH>) {
chop;
if (/^t\s*#/) { # if a new transaction starts
$ref = []; # then create a new reference to an array
push(@refs, $ref); # and memorize the reference
}
push(@$ref, $_); # append the line to the current array
}
for $ref (@refs) {
print(join(" " x 4, $ref->[0], $ref->[-3], $ref->[-2], $ref->[-1]), "\n");
}
Output:
t # 3-0, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-1, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-2, 1 u 0 1 2 u 0 2 2 u 1 2 2
t # 3-3, 1 u 0 1 2 u 0 2 2 u 1 2 2
Define regex patterns $skip
, $data
and $tran
, walk through data, assemble transaction line and push into array when new transaction starts
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $skip = qr/^v \d+ \d+/;
my $data = qr/^u \d+ \d+ \d+/;
my $tran = qr/^t # \d-\d, \d/;
my @array;
my $line = <DATA>;
chomp($line);
while( <DATA> ) {
next if /$skip/;
chomp;
$line .= ' ' . $_ if /$data/;
if( /$tran/ ) {
push @array, $line;
$line = $_;
}
}
push @array, $line;
say Dumper(\@array);
__DATA__
t # 3-0, 1
v 0 0
v 1 19
v 2 2
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-1, 1
v 0 0
v 1 15
v 2 2
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-2, 1
v 0 0
v 1 17
v 2 2
u 0 1 2
u 0 2 2
u 1 2 2
t # 3-3, 1
v 0 0
v 1 18
v 2 7
u 0 1 2
u 0 2 2
u 1 2 2
Output
$VAR1 = [
't # 3-0, 1 u 0 1 2 u 0 2 2 u 1 2 2',
't # 3-1, 1 u 0 1 2 u 0 2 2 u 1 2 2',
't # 3-2, 1 u 0 1 2 u 0 2 2 u 1 2 2',
't # 3-3, 1 u 0 1 2 u 0 2 2 u 1 2 2'
];
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.