Perl - count words in a file
I want to count the words in a file and get, as a result, the number of occurrences of each word.
My script:
#!/usr/bin/perl
#use strict;
#use warnings;
use POSIX qw(strftime);
$datestring = strftime "%Y-%m-%d", localtime;
print $datestring;
my @files = <'/mnt/SESSIONS$datestring*'>;
my $latest;
foreach my $file (@files) {
$latest = $file if $file gt $latest;
}
@temp_arr=split('/',$latest);
open(FILE,"<$latest");
print "file loaded \n";
my @lines=<FILE>;
close(FILE);
#my @temp_line;
foreach my $line(@lines) {
@line=split(' ',$line);
#push(@temp_arr);
$line =~ s/\bNT AUTHORITY\\SYSTEM\b/NT__AUTHORITY\\SYSTEM/ig;
print $line;
#print "$line[0] $line[1] $line[2] $line[3] $line[4] $line[5] \n";
}
My log file:
SID        USER                      TERMINAL        PROGRAM
---------- ------------------------- --------------- -------------------------
1          SYSTEM                    titi            toto (fifi)
2          SYSTEM                    titi            toto (fofo)
4          SYSTEM                    titi            toto (bobo)
5          NT_AUTHORITY\SYSTEM       titi            roro
6          NT_AUTHORITY\SYSTEM       titi            gaga
7          SYSTEM                    titi            gogo (fifi)
5 rows selected.
The result I want:
User = 3 SYSTEM with program toto
User = 1 SYSTEM with program gogo
Thanks for any information.
I see this as a two-step problem: you want to parse the log files, and you also want to store elements of that data in a data structure that you can use for counting.
This is a guess, based on your sample data, but if your data is fixed-width, one way to parse it into fields is to use unpack. I think substr might be more efficient, so consider how many files you need to parse and how long each one is.
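The substr alternative mentioned above could look roughly like the sketch below. The helper name parse_fixed is hypothetical, and the offsets (0, 11, 37, 53) and widths (10, 25, 15, 25) are assumptions read off the dashed separator line in the sample log, so adjust them to the real layout.

```perl
use strict;
use warnings;

# Hypothetical helper: slice one fixed-width line with substr.
# Offsets and widths are assumptions taken from the sample log's separator line.
sub parse_fixed {
    my ($line) = @_;
    chomp $line;
    # Pad short lines so substr never reads past the end of the string.
    $line = sprintf '%-78s', $line;
    my @fields = (
        substr($line, 0,  10),    # SID
        substr($line, 11, 25),    # USER
        substr($line, 37, 15),    # TERMINAL
        substr($line, 53, 25),    # PROGRAM
    );
    s/^\s+|\s+$//g for @fields;   # trim surrounding blanks in place
    return @fields;
}

my $sample = sprintf '%-10s %-25s %-15s %s', 1, 'SYSTEM', 'titi', 'toto (fifi)';
my ($sid, $user, $terminal, $program) = parse_fixed($sample);
print "$sid|$user|$terminal|$program\n";    # prints: 1|SYSTEM|titi|toto (fifi)
```

Whether this is actually faster than unpack depends on the Perl version and the data, so it is worth benchmarking on a real log before choosing.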
I would store the data in a hash of hashes (user, then program) and then walk through it after all the files have been read.
my %counts;
open my $IN, '<', 'logfile.txt' or die "Cannot open logfile.txt: $!";
while (<$IN>) {
    next if length($_) < 51;    # skip blank lines and the "rows selected" footer
    my ($sid, $user, $terminal, $program) = unpack 'A9 @11 A25 @37 A15 @53 A25', $_;
    # you need some way to filter out bogus or header rows
    next if $sid eq '---------' or $user eq 'USER';
    $program =~ s/\s*\(.*//;    # based on your example, turn "toto (fifi)" into "toto"
    $counts{$user}{$program}++;
}
close $IN;
while (my ($user, $ref) = each %counts) {
    while (my ($program, $count) = each %$ref) {
        print "User = $count $user with program $program\n";
    }
}
Output from the program (hash order is not guaranteed, so the lines may appear in a different order):
User = 3 SYSTEM with program toto
User = 1 SYSTEM with program gogo
User = 1 NT_AUTHORITY\SYSTEM with program roro
User = 1 NT_AUTHORITY\SYSTEM with program gaga
This code automatically detects the width of the input fields (your snippet looks like the output of an Oracle query) and prints the results:
#!/usr/bin/perl
use strict;
use warnings;
use v5.10;
open my $file, '<', 'input.log' or die "Cannot open input.log: $!";
my $data = {};
my @cols_size = ();
while (my $line = <$file>) {
    # The dashed separator line gives the width of every column.
    if ($line =~ /--/) {
        foreach (split(/\s/, $line)) {
            push(@cols_size, length($_) + 1);    # +1 for the blank between columns
        }
        next;
    }
    next unless (@cols_size);                # still above the separator line
    next if ($line =~ /rows selected/);
    my ($sid, $user, $terminal, $program) = unpack('A' . join('A', @cols_size), $line);
    next unless ($sid);                      # blank line
    $program =~ s/\s*\(\w+\)//;              # turn "toto (fifi)" into "toto"
    $data->{$user}->{$program}++;
}
close $file;
foreach my $user (keys %{$data}) {
    foreach my $program (keys %{$data->{$user}}) {
        say sprintf("User = %s %s with program %s", $data->{$user}->{$program}, $user, $program);
    }
}
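To make the width detection concrete, here is a small sketch of how the unpack template comes out for the sample log. The column widths 10, 25, 15 and 25 are an assumption based on the dashed separator line shown in the question, and the separator is rebuilt in code only to keep the example self-contained.

```perl
use strict;
use warnings;

# Rebuild the separator line; the widths are assumptions taken from the sample log.
my $separator = join ' ', map { '-' x $_ } 10, 25, 15, 25;

# Width of each column plus one for the blank that separates it from the next.
my @cols_size = map { length($_) + 1 } split /\s+/, $separator;

# Same construction as in the code above: 'A' . join('A', ...)
my $template = 'A' . join('A', @cols_size);
print "$template\n";    # prints: A11A26A16A26

# A hypothetical fixed-width row laid out like the sample data.
my $row = sprintf '%-10s %-25s %-15s %s', 7, 'SYSTEM', 'titi', 'gogo (fifi)';
my ($sid, $user, $terminal, $program) = unpack $template, $row;
print "$sid|$user|$terminal|$program\n";    # prints: 7|SYSTEM|titi|gogo (fifi)
```

The A format strips trailing blanks from each field, which is why no extra trimming is needed after unpack.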
I don't understand $counts{$user}{$program}++;
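A minimal sketch of what the statement $counts{$user}{$program}++ does, using a few hard-coded (user, program) pairs in place of the parsed log lines. The @rows data is made up for illustration.

```perl
use strict;
use warnings;

my %counts;

# (user, program) pairs as they might come out of the parsing loop (illustrative data).
my @rows = (
    [ 'SYSTEM', 'toto' ],
    [ 'SYSTEM', 'toto' ],
    [ 'SYSTEM', 'gogo' ],
);

for my $row (@rows) {
    my ($user, $program) = @$row;
    # The first time a user is seen, Perl creates $counts{$user} as an empty
    # hash reference (autovivification); ++ then treats the missing counter
    # as 0 and bumps it to 1. Later hits on the same pair just add 1.
    $counts{$user}{$program}++;
}

print "$counts{SYSTEM}{toto}\n";    # prints: 2
print "$counts{SYSTEM}{gogo}\n";    # prints: 1
```

So %counts ends up as a hash keyed by user, where each value is another hash keyed by program, holding the number of times that pair was seen.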