Trying to compare the part-of-speech tags of two files and print the matching tags to a separate file

Question

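I am trying to write a Perl program, on Windows, that compares the part-of-speech tags of two text files and prints the matching tags, together with the corresponding words, to a separate file.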

File1:
boy N
went V
loves V
girl N
File2:
boy N
swims V
girl N
loves V

Expected output:
boy N N
girl N N
loves V V

The columns are separated by tabs. The code I have so far:

use strict;
use warnings;

my $filename = 'file1.txt';
open(my $fh, $filename)
  or die "Could not open file '$filename'";

while (my $row = <$fh>) {
  chomp $row;
  print "$row\n";
}
my $tagfile = 'file2.txt';
open(my $tg, $tagfile)
  or die "Could not open file '$tagfile'";
while (my $row = <$tg>) {
    chomp $row;
    print "$row\n";
    }

Answer 1

It's not really clear what you're asking, but I think this is close.

#!/usr/bin/perl

use strict;
use warnings;

my ($file1, $file2) = @ARGV;

my %words; # Keep details of the words
while (<>) { # Read all input files a line at a time
  chomp;
  my ($word, $pos) = split;
  $words{$ARGV}{$word}{$pos}++;

  # If we're processing file1 then don't look for a match
  next if $ARGV eq $file1;

  if (exists $words{$file1}{$word}{$pos}) {
     print join(' ', $word, ($pos) x 2), "\n";
  }
}

Run it like this:

./pos file1 file2

Which gives:

boy N N
girl N N
loves V V

Answer 2

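OK, first of all, what you want here is a hash.

You need to:

Read the first file and split each line into "word" and "pos".
Save those in a hash.
Read the second file and split each line into "word" and "pos".
Compare each pair with the hash you populated and check whether it matches.

Something like this: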

#!/usr/bin/env perl 
use strict;
use warnings;

#declare our hash:

my %pos_for;


#open the first file
my $filename = 'file1.txt';
open( my $fh, '<', $filename ) or die "Could not open file '$filename'";

while (<$fh>) {
    #remove linefeed from this line.
    #note - both chomp and split default to using $_ which is defined by the while loop.
    chomp;

    #split it on whitespace.
    my ( $word, $pos ) = split;

    #record this value in the hash %pos_for
    $pos_for{$word} = $pos;
}
close($fh);

#process second file:

my $tagfile = 'file2.txt';
open( my $tg, '<', $tagfile ) or die "Could not open file '$tagfile'";
while (<$tg>) {

    #remove linefeed from this line.
    chomp;

    #split it on whitespace.
    my ( $word, $pos ) = split;

    #check if this word was in the other file
    if (defined $pos_for{$word}
        #and that it's the same "pos" value.
        and $pos_for{$word} eq $pos
        )
    {
        print "$word $pos\n";
    }
}
close($tg);
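
Both answers print the matches to the console, while the question asks for a separate file. The following is a minimal sketch of one way to do that, not the method of either answer. The helper name `matching_tags` and the output name `matches.txt` are assumptions made for illustration, and the sample data is held in memory so the script runs as is. It splits on tabs, as the question describes, and stores every tag seen for a word, so a word that appears with more than one tag in the first file is still matched.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "word<TAB>tag" strings for every word/tag
# pair in the second handle that also appeared in the first handle.
sub matching_tags {
    my ($first_fh, $second_fh) = @_;

    # Record every (word, tag) pair, so one word may carry several tags.
    my %seen;
    while (my $line = <$first_fh>) {
        $line =~ s/\r?\n\z//;              # strip LF or CRLF line endings
        my ($word, $pos) = split /\t/, $line;
        next unless defined $pos;          # skip blank or malformed lines
        $seen{$word}{$pos} = 1;
    }

    my @matches;
    while (my $line = <$second_fh>) {
        $line =~ s/\r?\n\z//;
        my ($word, $pos) = split /\t/, $line;
        next unless defined $pos;
        # Test the outer key first to avoid autovivifying $seen{$word}.
        push @matches, "$word\t$pos" if $seen{$word} && $seen{$word}{$pos};
    }
    return @matches;
}

# Sample data from the question, read through in-memory filehandles.
# For real files, use open(my $in1, '<', 'file1.txt') and so on.
my $file1 = "boy\tN\nwent\tV\nloves\tV\ngirl\tN\n";
my $file2 = "boy\tN\nswims\tV\ngirl\tN\nloves\tV\n";
open(my $in1, '<', \$file1) or die "Could not open in-memory file1: $!";
open(my $in2, '<', \$file2) or die "Could not open in-memory file2: $!";

my @matches = matching_tags($in1, $in2);

# Write the matches to a separate file (the name is an assumption).
my $outfile = 'matches.txt';
open(my $out, '>', $outfile) or die "Could not open '$outfile' for writing: $!";
print {$out} "$_\n" for @matches;
close($out) or die "Could not close '$outfile': $!";
```

The matches are written in the order they occur in the second file. To repeat the tag on each line, as in the expected output of the question, print "$word\t$pos\t$pos" instead.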

Answer 1
0 Accepted 2015-10-01 15:39:34

Answer 2
0 2015-10-01 15:46:47