Modify input file in Perl
I wrote a Perl program that takes two text files as input.
The first file contains sequences and their probabilities, in this format:
good morning 0.5
The second file contains every word with its probability, in this format:
good 0.5
morning 0.6
For each sequence, my script computes this formula:
log( prob(sequence) / ( (prob(word1) - prob(sequence)) * (prob(word2) - prob(sequence)) ) )
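The problem is that in some cases prob(sequence) is equal to prob(word1) or prob(word2), so I get "Illegal division by zero".
Is there a way to handle these cases by adding a small number to the values from the second file (smoothing)?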
#!/usr/bin/perl
use strict; ## PLE
use warnings;

my $inFile = "file1.txt";
my $outFile ="TEST.txt";
my %hashFR = getVocab("file2.txt");
my @result;
my $bloc = 50000;
my $cmp = 0;

open fileIn, "<$inFile" or die $!;
while (<fileIn>) {
    chomp;
    my $flag = 0;
    my $ligne = $_;
    my @words = getWords($ligne);
    if (my $prob = pop @words) {
        $prob =~ s/\(//g;
        my $probWords = 1;
        foreach my $word (@words) {
            my $probWord;
            if (exists $hashFR{$word}) {
                $probWord = $hashFR{$word};
            }
            $probWords *= $probWord-$prob;
        }
        my $calc = $prob*log2($prob/($probWords));
        my $result10 = sprintf("%.10f", $calc);
        push @result, join(' ',@words) ." (".$result10.")\n";
    }
}
#if(scalar(@result) == $bloc)
{
    $cmp += $bloc;
    print "$cmp lignes traités\n";
    writeToResultFile($outFile,@result);
    @result = ();
}
# Split a line on whitespace and return the pieces.
sub getWords {
    my ($ligne) = @_;
    my @words = split(' ', $ligne);
    return @words;
}

# Read "word probability" pairs from a file into a hash.
sub getVocab {
    my ( $filename ) = @_;
    my %hash = ();
    open fileVocab, "<$filename" or die $!;
    while (<fileVocab>) {
        chomp;
        if (2 == (my($mot, $prob) = split( / / ))) {
            $hash{trim($mot)} = trim($prob);
        }
    }
    close fileVocab;
    return %hash;
}

# Append the result lines to the output file.
sub writeToResultFile {
    my ($filename,@res) = @_;
    open(INFO, ">>$filename") or die $!;
    foreach ( @res) {
        print INFO $_;
    }
    close INFO;
}

# Base-2 logarithm.
sub log2 {
    my $n = shift;
    return (log($n)/log(10))/(log(2)/log(10));
}

# Strip leading and trailing whitespace.
sub trim {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}
You can use exception handling like this:
my $calc;
eval {
    $calc = $prob*log2($prob/($probWords));
};
if ($@){
    $calc = 0; # or whatever suits you
}
Or, more simply (the // defined-or operator requires Perl 5.10 or later):
my $calc = eval { $prob*log2($prob/($probWords)) } // 'NaN';
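If you would rather smooth than catch the error, one option is to clamp each difference to a small positive floor before multiplying. The following is only a sketch under stated assumptions: `smoothed_score` and `$EPSILON` are names made up for this illustration, the value 1e-6 is arbitrary, and prob(sequence) is assumed to be strictly positive. It leaves the second file untouched and adjusts the differences at computation time instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical smoothing floor; 1e-6 is an assumed value, tune it for your data.
my $EPSILON = 1e-6;

# Base-2 logarithm.
sub log2 {
    my ($n) = @_;
    return log($n) / log(2);
}

# Computes prob(seq) * log2( prob(seq) / product(prob(word_i) - prob(seq)) ).
# Any difference smaller than $EPSILON (including zero or negative values)
# is raised to $EPSILON, so the denominator is always positive.
# Assumes $prob_seq > 0, otherwise log() would still fail.
sub smoothed_score {
    my ($prob_seq, @word_probs) = @_;
    my $denominator = 1;
    for my $prob_word (@word_probs) {
        my $diff = $prob_word - $prob_seq;
        $diff = $EPSILON if $diff < $EPSILON;
        $denominator *= $diff;
    }
    return $prob_seq * log2($prob_seq / $denominator);
}

# "good morning 0.5" with good = 0.5 and morning = 0.6:
# the first difference is zero and gets replaced by $EPSILON.
printf "%.10f\n", smoothed_score(0.5, 0.5, 0.6);
```

Note that the smoothed cases produce large scores, because the denominator becomes very small. Whether that is acceptable depends on how the scores are used afterwards.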