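Storing a table in a data structure with Perl
I am trying to work out how to store the following table in a complex data structure, and which data structure to use. The input is a tab-delimited text file exported from Excel. Note that some cells are empty (here, the whole "RQ Max" column). This is the table: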
Well  Sample Name  Target Name  RQ Max  Ct Mean
1     Sample 1     actin                20,514
2     Sample 1     claudin              30,544
3     Sample 1     occludin             31,183
25    Sample 1     actin                20,514
26    Sample 1     claudin              30,544
27    Sample 1     occludin             31,183
49    Sample 2     actin                20,416
50    Sample 2     claudin              25,611
51    Sample 2     occludin             27,831
73    Sample 2     actin                20,416
74    Sample 2     claudin              25,611
75    Sample 2     occludin             27,831
97    Sample 3     actin                24,213
98    Sample 3     claudin              32,065
99    Sample 3     occludin             34,556
194   H2O          claudin
195   H2O          occludin
217   H2O          actin
218   H2O          claudin
219   H2O          occludin
This is my code:
#!/usr/bin/perl
use strict;
use warnings;
# CHECK FOR CORRECT USAGE
unless (@ARGV == 1){
    die "Usage: perl $0 \"file.txt\"\n";
}
my $input = $ARGV[0];
open (my $read, '<', $input) || die "Cannot open $input: $!\n";
my $line = '';
my %data;
my $i = 0;
while ($line = <$read>){
    chomp $line;
    # keep only data rows, which start with a well number
    if ($line =~ m/^[0-9]/){
        $i++;
        $data{$i} = [ split /\t/, $line ];
    }
}
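As you can see, I am only at the start of the program, because I am not sure which structure to use. I actually need just three columns of the table: "Sample Name", "Target Name" and "Ct Mean". Since I later want to calculate something for each sample, using the sample names as keys would probably help. In a hash of hashes, I would like the target name to be the "second key". Can someone push me in the right direction? I am struggling with storing the data because I have not used Perl for a long time...
This is what I want in the end: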
%data = (
    'Sample 1' => {
        actin    => 20.514,
        claudin  => 30.544,
        occludin => 31.183,
    },
    'Sample 2' => {
        actin    => 20.416,
        claudin  => 25.611,
        occludin => 27.831,
    },
    ...
);
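So, a couple of points. First, if you want to read from files specified on the command line, then:
while ( <> ) {
is the tool for the job. It reads either STDIN or the files named on the command line, exactly as sed or grep would.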
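To see the diamond operator in isolation, here is a minimal sketch. It writes a throwaway file with File::Temp and places its name in @ARGV, which stands in for passing the file on the command line; the file contents and the @lines array are made up for the illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small demo file so the sketch runs on its own (hypothetical data).
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
print {$fh} "first line\nsecond line\n";
close $fh or die "Cannot close $filename: $!\n";

# Simulate "perl script.pl <filename>" by filling @ARGV ourselves.
@ARGV = ($filename);

# <> reads the files listed in @ARGV, or STDIN when @ARGV is empty.
my @lines;
while ( my $line = <> ) {
    chomp $line;
    push @lines, $line;
}

print scalar(@lines), " lines read\n";
```

In a real script you would drop the File::Temp part and simply run it as `perl script.pl file.txt` or pipe data into it.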
Second, you can use a hash slice to parse the tab-separated data.
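As a small illustration of what a hash slice does, the sketch below assigns one hard-coded row to a hash keyed by the column names. The header list and the sample line are copied from the table in the question; in the real script both would come from the input file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Column names as they appear in the header row of the question's table.
my @header = ( 'Well', 'Sample Name', 'Target Name', 'RQ Max', 'Ct Mean' );

# One data row; the empty "RQ Max" cell shows up as two consecutive tabs.
my $line = "1\tSample 1\tactin\t\t20,514";

# Hash slice: assign all fields to their column names in one statement.
my %row;
@row{@header} = split /\t/, $line;

print "$row{'Sample Name'} / $row{'Target Name'} / $row{'Ct Mean'}\n";
```

Each field can then be looked up by column name instead of by position, which keeps the code readable if columns are ever reordered.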
So, assuming you only want to extract the Ct Mean:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my %results;

#read header row
chomp ( my @header = split /\t/, <> );
#tidy up leading whitespace in the fields (there's some in your example data)
s/^\s+// for @header;

#iterate the rest of STDIN or files on command line.
while ( <> ) {
    #remove trailing linefeed.
    chomp;
    #tidy up leading whitespace again.
    s/^\s+//;
    my %row;
    #use hash slice to read key-value.
    @row{@header} = split /\t/;
    #print each parsed row for debugging (remove once it works)
    print Dumper \%row;
    #skip the H2O lines.
    next if $row{'Sample Name'} eq 'H2O';
    #Cosmetic assignments - could rewrite to a single one
    my $sample_name = $row{'Sample Name'};
    my $ct_mean     = $row{'Ct Mean'};
    my $target_name = $row{'Target Name'};
    $results{$sample_name}{$target_name} = $ct_mean;
}

print Dumper \%results;
This gives you:
$VAR1 = {
  'Sample 2' => {
    'occludin' => '27,831',
    'actin' => '20,416',
    'claudin' => '25,611'
  },
  'Sample 3' => {
    'occludin' => '34,556',
    'actin' => '24,213',
    'claudin' => '32,065'
  },
  'Sample 1' => {
    'claudin' => '30,544',
    'occludin' => '31,183',
    'actin' => '20,514'
  }
};
(NB: hashes are explicitly unordered.)
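The values above are still strings with a comma, while the structure asked for in the question holds numbers such as 20.514. The following is a rough sketch of one way to bridge that gap, assuming the comma is a decimal separator (as the desired output suggests). The %results hash is a hand-copied excerpt of the dump above, and sorting the keys is only there to get a repeatable order.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Excerpt of the structure built above; values are strings with decimal commas.
my %results = (
    'Sample 1' => { actin => '20,514', claudin => '30,544', occludin => '31,183' },
    'Sample 2' => { actin => '20,416', claudin => '25,611', occludin => '27,831' },
);

# Assumption: the comma is a decimal separator, so swap it for a point
# and add 0 to store the value as a number.
for my $sample ( keys %results ) {
    for my $target ( keys %{ $results{$sample} } ) {
        ( my $number = $results{$sample}{$target} ) =~ tr/,/./;
        $results{$sample}{$target} = $number + 0;
    }
}

# Hashes have no inherent order, so sort the keys when order matters.
for my $sample ( sort keys %results ) {
    for my $target ( sort keys %{ $results{$sample} } ) {
        printf "%s\t%s\t%.3f\n", $sample, $target, $results{$sample}{$target};
    }
}
```

The same conversion could also be done inside the reading loop, just before the value is stored in %results.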