
Modify input file in Perl

I wrote a Perl program that takes two text files as input.

The first file contains sequences and their probabilities, in this format:

good morning 0.5

The second file contains every word with its probability, in this format:

good 0.5
morning 0.6

For each sequence, my script computes this formula:

log( prob(sequence) / ( (prob(word1) - prob(sequence)) * (prob(word2) - prob(sequence)) ) )

The problem is that in some cases prob(sequence) equals prob(word1) or prob(word2), so the denominator is zero and I get "Illegal division by zero".
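
The sample values shown above already trigger the error. The following is a minimal reproduction, not part of the original script; the variable names are chosen only for illustration. It builds the denominator from the two example words and traps the failing division with eval so the error message can be printed.

```perl
use strict;
use warnings;

# Sample values taken from the two example files above.
my $prob_seq  = 0.5;                             # "good morning 0.5"
my %prob_word = (good => 0.5, morning => 0.6);

# Product of (prob(word) - prob(sequence)) over the words of the sequence.
my $denominator = 1;
$denominator *= $prob_word{$_} - $prob_seq for qw(good morning);

# (0.5 - 0.5) makes the whole product 0, so the division dies.
my $ratio = eval { $prob_seq / $denominator };
my $error = $@;

print "denominator = $denominator\n";
print "error: $error" unless defined $ratio;
```

Any sequence whose probability matches the probability of one of its words zeroes the product in the same way.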

Is there a way to handle these cases by adding a small number to the values from the second file? (smoothing)

#!/usr/bin/perl
use strict; ## PLE
use warnings;

my $inFile = "file1.txt";
my $outFile ="TEST.txt";
my %hashFR = getVocab("file2.txt");
my @result;

my $bloc = 50000;
my $cmp = 0;

open fileIn, "<$inFile" or die $!;
while (<fileIn>) {
    chomp;
    my $flag = 0;
    my $ligne = $_;
    my @words = getWords($ligne);
    if (my $prob = pop @words) {
        $prob  =~ s/\(//g;
        my $probWords = 1;

        foreach my $word (@words) {
            my $probWord;
            if (exists $hashFR{$word}) {
                $probWord = $hashFR{$word};
            }
            $probWords *= $probWord-$prob;
        }

        my $calc = $prob*log2($prob/($probWords));
        my $result10 = sprintf("%.10f", $calc);
        push @result, join(' ',@words) ." (".$result10.")\n";
    }
}

#if(scalar(@result) == $bloc)
{
    $cmp += $bloc;
    print "$cmp lignes traités\n";
    writeToResultFile($outFile,@result);
    @result = ();
}

sub getWords {
    my ($ligne) = @_;    # take the line from the argument list, not the global $_

    my @words = split(' ', $ligne);

    return @words;
}

sub getVocab {
    my ( $filename ) = @_;
    my %hash = ();

    open fileVocab, "<$filename" or die $!;
    while (<fileVocab>) {
        chomp;

        if (2 == (my($mot, $prob) = split( / / ))) {
            $hash{trim($mot)} = trim($prob);
        }
    }
    close fileVocab;
    return %hash;
}

sub writeToResultFile {
    my ($filename,@res) = @_;
    open(INFO, ">>$filename") or die $!;
    foreach ( @res) {
        print INFO $_;
    }
    close INFO;
}
sub log2 {
    my $n = shift;
    return log($n)/log(2);
}

sub trim($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

You can use exception handling like this:

my $calc;
eval {
    $calc = $prob*log2($prob/($probWords));
};
if ($@) {
    $calc = 0; # or whatever suits you
}

Or, more simply:

my $calc = eval { $prob*log2($prob/($probWords)) } // 'NaN';
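
The eval approach above replaces the result after the division has failed. If you would rather smooth the values, as the question asks, one possible sketch is to nudge each difference away from zero before multiplying. This adjusts the values in memory and does not rewrite the second file. The helper smoothed_diff and the constant $EPSILON are hypothetical names, and the value 1e-6 is an assumption to tune for your data. Clamping negative differences to $EPSILON is also an assumption, made because the log of a negative ratio is undefined.

```perl
use strict;
use warnings;

# Hypothetical smoothing constant; the value is an assumption.
my $EPSILON = 1e-6;

# Hypothetical helper: returns prob(word) - prob(sequence), but never less
# than $EPSILON, so the product of differences stays strictly positive.
sub smoothed_diff {
    my ($prob_word, $prob_seq) = @_;
    my $diff = $prob_word - $prob_seq;
    return $diff > $EPSILON ? $diff : $EPSILON;
}

sub log2 {
    my ($n) = @_;
    return log($n) / log(2);
}

# Sample values from the question.
my $prob   = 0.5;
my %hashFR = (good => 0.5, morning => 0.6);

my $probWords = 1;
$probWords *= smoothed_diff($hashFR{$_}, $prob) for qw(good morning);

my $calc = $prob * log2($prob / $probWords);
printf "%.10f\n", $calc;
```

In the original script, the same helper could replace the expression $probWord-$prob inside the foreach loop. A smaller $EPSILON gives larger scores for the smoothed sequences, so the choice affects how they rank against the others.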

