I have a comma-separated table. I want to go through it in order and, for each row, test whether a value from the num+1, num+2, num+3, or num+4 field exists in the num column, then delete the matching row if it does.
My script is:
#!"C:\perl\bin\perl.exe"
use strict;
use warnings;
my $file_name = shift @ARGV;
die "Usage: $0 <file_to_be_processed> > <output_file>\n" unless defined $file_name;
my $dic; # This is going to hold all values to be excluded.
open IN, "<", $file_name or die "Could not open $file_name $!\n";
while(<IN>) {
    chomp;
    @_ = split /,/;
    shift @_;
    map{$dic->{$_}++} @_;
}
close IN;
open IN, "<", $file_name or die "Could not open $file_name $!\n";
while(<IN>) {
    chomp;
    @_ = split /,/;
    print $_."\n" unless defined $dic->{$_[0]};
}
close IN;
Here is my table:
num,num+1,num+2,num+3,num+4
1014,1015,1016,1017,1018
1015,1016,1017,1018,1019
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
Here is the expected result:
num,num+1,num+2,num+3,num+4
1014,1015,1016,1017,1018
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
My script works, but it excludes num 1019 from the result. Here is the actual output of the script:
num,num+1,num+2,num+3,num+4
1014,1015,1016,1017,1018
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
Looks like a misuse of map to me - if you're not using the result of map then you should probably be using a for loop instead.
But that aside - what you're doing here is building a hash of every value in columns 2 to 5 (everything after the first column) and excluding a line if its first column appears in that hash?
Your sample data includes 1019 in the last column of the 1015 line, which is why the 1019 line is being excluded.
Your $dic looks like:
$VAR1 = {
'1016' => 2,
'1021' => 1,
'1022' => 1,
'1028' => 1,
'1034' => 1,
'num+3' => 1,
'1017' => 2,
'num+1' => 1,
'1015' => 1,
'1020' => 1,
'num+2' => 1,
'1023' => 1,
'1026' => 1,
'1019' => 1,
'1031' => 1,
'1027' => 1,
'1032' => 1,
'1033' => 1,
'1018' => 2,
'1029' => 1,
'num+4' => 1
};
As 1019 is in it, the 1019 line gets skipped.
Also:
Data::Dumper is useful for seeing what's in a data structure.
Don't use map like that. Use a for loop. Something like:
while(<IN>) {
    chomp;
    my ( $col, @rest ) = split /,/;   # keep the first column out of the hash
    $dic->{$_}++ for @rest;
}
Don't use @_ like that. It's a special variable with a specific meaning (it holds a subroutine's arguments). Call it something else.
Current good practice is to use a lexical filehandle with open, e.g. open( my $in, '<', $filename ) or die $!; because it keeps the scope down.
If you just want to check the first column, you can assign like this: my ( $col, @rest ) = split /,/; and just test $col. Or you can skip the chomp entirely and just do:
print unless defined $dic->{(split /,/)[0]};
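The suggestions above can be combined into a small sketch of the first pass: a lexical filehandle, a for loop instead of map, named variables instead of @_, and Data::Dumper to inspect the result. The helper name build_dic and the in-memory sample data are assumptions made for illustration, not part of the original script. Note that this only tidies the first pass; it still records 1019, so it does not by itself change which rows are excluded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Sample rows from the question, read through an in-memory filehandle
# so the sketch runs without an input file.
my $sample = <<'END';
num,num+1,num+2,num+3,num+4
1014,1015,1016,1017,1018
1015,1016,1017,1018,1019
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
END

# build_dic is a hypothetical helper name, not part of the original script.
sub build_dic {
    my ($fh) = @_;
    my %dic;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $col, @rest ) = split /,/, $line;    # first column is set aside
        $dic{$_}++ for @rest;                     # count every other value
    }
    return \%dic;
}

open( my $in, '<', \$sample ) or die "Could not open sample data: $!";
my $dic = build_dic($in);
close $in;

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper($dic);
```

To run it against a real file, open the filehandle on the file name instead of on \$sample.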
You have to dynamically change your hash when you skip a line:
#!/usr/bin/perl
use warnings;
use strict;
my %dic;
my $pos = tell DATA; # Remember where the data start.
while (<DATA>) {
    chomp;
    my @ar = split /,/;
    shift @ar;
    $dic{$_}++ for @ar;
}
seek DATA, $pos, 0; # Back to the data start.
while (<DATA>) {
    chomp;
    my @ar = split /,/;
    if ($dic{ $ar[0] }) {
        delete $dic{ $_ } for @ar[1 .. $#ar]; # <-- this was missing!
    } else {
        print "$_\n";
    }
}
__DATA__
1014,1015,1016,1017,1018
1015,1016,1017,1018,1019
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
1019 is being skipped because, even though the 1015 row will be skipped in the second loop, the 1019 key was already defined in $dic by your first loop:
map{$dic->{$_}++} @_;
When the first loop reaches the 1015 row, that line increments keys 1016, 1017, and 1018 and creates key 1019 (set to 1). Then in your second loop:
print $_."\n" unless defined $dic->{$_[0]};
your unless skips the 1015 row, but nothing removes the keys that the 1015 row added, so the 1019 row is still excluded.
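To make that explanation concrete, here is a minimal sketch that records which row contributed each key and then reports why each row is kept or excluded under the original two-pass logic. The hash name %source and the hard-coded three rows are assumptions made for illustration.

```perl
use strict;
use warnings;

# The first three data rows from the question.
my @rows = (
    [ 1014, 1015, 1016, 1017, 1018 ],
    [ 1015, 1016, 1017, 1018, 1019 ],
    [ 1019, 1020, 1021, 1022, 1023 ],
);

# %source is a hypothetical name: for each value seen outside the first
# column, it lists the first-column values of the rows that contain it.
my %source;
for my $row (@rows) {
    my ( $num, @rest ) = @$row;
    push @{ $source{$_} }, $num for @rest;
}

# Second pass, as in the original script: a row is excluded whenever its
# first column is a key, no matter whether the contributing row was kept.
for my $row (@rows) {
    my $num = $row->[0];
    if ( my $from = $source{$num} ) {
        print "$num is excluded because it appears in row(s): @$from\n";
    }
    else {
        print "$num is kept\n";
    }
}
```

With these three rows, 1019 is reported as excluded because of row 1015, even though row 1015 is itself excluded.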
If I understand you correctly then this is all you need
use strict;
use warnings;
my %seen;
while ( <DATA> ) {
    chomp;
    my @fields = split /,/;
    if ( not $seen{ shift @fields } ) {
        $seen{$_} = 1 for @fields;
        print "$_\n";
    }
}
__DATA__
1014,1015,1016,1017,1018
1015,1016,1017,1018,1019
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
Output:
1014,1015,1016,1017,1018
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
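The single-pass approach can also be wrapped in a subroutine that takes any filehandle, which makes it easy to apply to a file (header line included) rather than to DATA. This is only a sketch under stated assumptions: filter_rows is a hypothetical helper name, and the sample table is fed through an in-memory filehandle so the code runs as is.

```perl
use strict;
use warnings;

# filter_rows is a hypothetical helper: it returns the lines whose first
# field has not appeared in a later column of an earlier kept line.
sub filter_rows {
    my ($fh) = @_;
    my ( %seen, @kept );
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $num, @rest ) = split /,/, $line;
        next if !defined $num || $seen{$num};    # blank or excluded line
        $seen{$_} = 1 for @rest;                 # only kept rows add keys
        push @kept, $line;
    }
    return @kept;
}

# The table from the question, including its header line.
my $table = <<'END';
num,num+1,num+2,num+3,num+4
1014,1015,1016,1017,1018
1015,1016,1017,1018,1019
1019,1020,1021,1022,1023
1025,1026,1027,1028,1029
1030,1031,1032,1033,1034
END

open( my $in, '<', \$table ) or die "Could not open sample data: $!";
my @kept = filter_rows($in);
close $in;

print "$_\n" for @kept;
```

The header passes through unchanged because its first field, num, never appears in a later column. To process a real file, open the filehandle on the file name taken from @ARGV instead of on \$table.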