How do I read a file in sequential order using Perl?

Question

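I have the following code. It works fine, but the output is not in the same order as the input file. For example, my input FASTA file contains a list of proteins. My code produces the output file without problems, but the order of the proteins appears to be random.

What am I missing?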

#!/usr/bin/perl
#usage: perl seqComp.pl <input_fasta_file> > <output_file>

use strict;

open( S, "$ARGV[0]" ) || die "cannot open FASTA file to read: $!";

my %s;      # a hash of arrays, to hold each line of sequence
my %seq;    #a hash to hold the AA sequences.
my $key;

while (<S>) {    #Read the FASTA file.
    chomp;
    if (/>/) {
        s/>//;
        $key = $_;
    } else {
        push( @{ $s{$key} }, $_ );
    }
}

foreach my $a ( keys %s ) {
    my $s = join( "", @{ $s{$a} } );
    $seq{$a} = $s;
    #print("$a\t$s\n");
}

my @aa = qw(A R N D C Q E G H I L K M F P S T W Y V);
my $aa = join( "\t", @aa );
#print ("Sequence\t$aa\n");

foreach my $k ( keys %seq ) {
    my %count;    # a hash to hold the count for each amino acid in the protein
    my @seq = split( //, $seq{$k} );
    foreach my $r (@seq) {
        $count{$r}++;
    }
    my @row;
    push( @row, ">" . $k );
    foreach my $a (@aa) {
        $count{$a} ||= 0;
        my $percentAA = sprintf( "%0.2f", $count{$a} / length( $seq{$k} ) );
        push( @row,
            $a . ":" . $count{$a} . "/" . length( $seq{$k} ) . "=" . sprintf( "%0.0f", $percentAA * 100 ) . "%" );
        $count{$a} = sprintf( "%0.2f", $count{$a} / length( $seq{$k} ) );

        # push(@row,$count{$a});
    }
    my $row = join( "\t\n", @row );
    print("$row\n\n");
}

Answer 1

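A hash such as %seq has no particular order. keys %seq returns the keys in whatever order Perl stores them internally, which is unrelated to the order in which they were inserted.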
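
A minimal sketch of what this means in practice, using made-up protein IDs that are not from the question: keys returns the same set of keys that went in, but in an order Perl does not define, and sorting them gives alphabetical order rather than input order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical protein IDs, in the order they would appear in the input file.
my @input_order = qw(P3 P1 P2);

my %seq;
$seq{$_} = 'ACDE' for @input_order;    # dummy sequence, for illustration only

# keys() returns every key, but in no guaranteed order.
my @from_hash = keys %seq;

# Sorting makes the order predictable, but alphabetical rather than input order.
my @sorted = sort keys %seq;

print "input order : @input_order\n";
print "keys order  : @from_hash\n";
print "sorted keys : @sorted\n";
```

The "keys order" line may differ from run to run, which is why the proteins in the question come out shuffled.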

Answer 2

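Arrays preserve order; hashes return their keys in an unpredictable order. If you want to preserve the order, you can push the keys onto an array, but only do so when the key does not already exist in the hash, or you will get duplicates.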

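my ( @keys, %hash );    # @keys records first-seen order; %hash holds the values

while (<S>) {
  my ($key,$value) = parse($_);    # parse() is a placeholder for your own line parser
  push @keys, $key unless exists $hash{$key};    # record each key only once
  $hash{$key} = $value;
}

for my $key (@keys) {    # iterate in input order, not hash order
  my $value = $hash{$key};

  ...
}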
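
Applied to the FASTA case in the question, the same idea might look like the sketch below. The sample data, the in-memory filehandle, and the names @order and @report are assumptions made for illustration; a real script would open the file named in $ARGV[0] instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up FASTA data, read through an in-memory filehandle so the sketch is self-contained.
my $fasta = ">protB\nMKV\nLA\n>protA\nGGA\n";
open my $fh, '<', \$fasta or die "cannot open in-memory FASTA: $!";

my ( @order, %seq );    # @order: headers as first seen; %seq: header => sequence
my $key;

while ( my $line = <$fh> ) {
    chomp $line;
    if ( $line =~ /^>(.*)/ ) {
        $key = $1;
        push @order, $key unless exists $seq{$key};    # record each header once
        $seq{$key} //= '';
    }
    elsif ( defined $key ) {
        $seq{$key} .= $line;    # append continuation lines to the current record
    }
}
close $fh;

# Walk @order, not keys %seq, so the output follows the input file.
my @report = map { "$_\t$seq{$_}" } @order;
print "$_\n" for @report;
```

Note that if the same header appeared twice, this sketch would concatenate both sequences under one key; whether that is acceptable depends on the data.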

Answer 3

If order matters, do not use a hash.

Instead, I suggest using an array of arrays, as shown below:

#!/usr/bin/perl
#usage: perl seqComp.pl <input_fasta_file> > <output_file>
use strict;
use warnings;
use autodie;

my $file = shift or die "Usage: perl $0 <input_fasta_file> > <output_file>";
open my $fh, '<', $file;

my @fasta;

while (<$fh>) {    #Read the FASTA file.
    chomp;
    if (/>/) {
        push @fasta, [ $_, '' ];
    } else {
        $fasta[-1][1] .= $_;
    }
}

my @aa = qw(A R N D C Q E G H I L K M F P S T W Y V);

for (@fasta) {
    my ( $k, $seq ) = @$_;

    print "$k\n";

    my %count;    # a hash to hold the count for each amino acid in the protein
    $count{$_}++ for split '', $seq;

    for my $a (@aa) {
        $count{$a} ||= 0;
        printf "%s:%s/%s=%.0f%%\n", $a, $count{$a}, length($seq), 100 * $count{$a} / length($seq);
    }

    print "\n";
}

3 solutions

Solution 1
0 2014-10-16 01:49:58

Solution 2
0 2014-10-16 02:04:36

Solution 3
0 Accepted 2014-10-16 02:51:32
