Count pattern occurrences in each line of a file?

My file looks like this:

id12 ack dko hhhh chfl dkl dll chfl
id14 slo ksol chfl dloo
id13 mse
id23 clos chfl dll alo

grep -c 'chfl' filename gives me the number of lines that contain chfl, but I want to count the occurrences of chfl per line. Like this:

id12 2
id14 1
id13 0
id23 1

Also, how do I do the same with two patterns to match, like chfl and dll?

perl -lane 'undef $c;
            for(@F){$c++ if(/^chfl$/)};
            print "$F[0] ",$c?$c:"0"' your_file

Or simply:

perl -lane '$c=0;
            for(@F){$c++ if(/^chfl$/)};
            print "$F[0] $c"' your_file

Tested below:

> cat temp
id12 ack dko hhhh chfl dkl dll chfl
id14 slo ksol chfl dloo
id13 mse
id23 clos chfl dll alo
> perl -lane '$c=0;for(@F){$c++ if(/^chfl$/)};print "$F[0] $c"' temp
id12 2
id14 1
id13 0
id23 1
> 

Also in awk (the logic is the same as in the Perl version above):

awk '{a=0;
     for(i=1;i<=NF;i++)if($i~/chfl/)a++;
     print $1,a}' your_file
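
One difference between the two versions is worth noting: the Perl regex /^chfl$/ only counts fields that are exactly chfl, while $i~/chfl/ in awk also counts fields that merely contain it. A minimal sketch of the two behaviours, using a made-up input line (the field chflx is an assumption for illustration and does not appear in the question's data):

```shell
# Hypothetical input: the second field only contains "chfl" as a substring.
line='id99 chflx chfl'

# Substring match: counts both "chflx" and "chfl".
substring_count=$(printf '%s\n' "$line" |
  awk '{a=0; for(i=1;i<=NF;i++) if($i ~ /chfl/) a++; print $1, a}')

# Whole-field match: counts only the field that equals "chfl".
exact_count=$(printf '%s\n' "$line" |
  awk '{a=0; for(i=1;i<=NF;i++) if($i == "chfl") a++; print $1, a}')

echo "$substring_count"   # id99 2
echo "$exact_count"       # id99 1
```

For the sample file in the question both forms give the same counts, so which one to use depends on whether partial matches should be counted.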

A Perl version that copes with multiple strings.

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

die "Usage: $0 pattern [pattern ...] file\n" unless @ARGV > 1;

my @patterns;
until (@ARGV == 1) {
  push @patterns, shift;
}

my $re = '(' . join('|', map { "\Q$_\E" } @patterns) . ')';

my %match;
while (<>) {
  if (my @matches = /$re/g) {
    $match{$_}++ for @matches;
  }
}

say "$_: $match{$_}" for sort keys %match;

A couple of test runs:

$ ./cgrep chfl cgrep.txt
chfl: 4
$ ./cgrep chfl dll cgrep.txt
chfl: 4
dll: 2

How about:

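use strict;
use warnings;
use feature 'say';   # needed for say
use Data::Dumper;    # needed for Dumper

my %res;
while(<DATA>) {
    chomp;
    my ($id,$rest) = $_ =~ /^(\S+)(.*)$/;
    # =()= forces list context, so the assignment yields the number of matches
    $res{chfl}{$id} =()= $rest =~ /(chfl)/g;
    $res{dll}{$id} =()= $rest =~ /(dll)/g;
}
say Dumper\%res;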

__DATA__
id12 ack dko hhhh chfl dkl dll chfl
id14 slo ksol chfl dloo
id13 mse
id23 clos chfl dll alo

Output:

$VAR1 = {
          'dll' => {
                     'id13' => 0,
                     'id12' => 1,
                     'id23' => 1,
                     'id14' => 0
                   },
          'chfl' => {
                      'id13' => 0,
                      'id12' => 2,
                      'id23' => 1,
                      'id14' => 1
                    }
        };

Use this:

awk 'BEGIN {print "id\tchfl\tdll\n--------------------"}{c=d=i=0;while(i++<NF){if($i=="chfl")c++; if($i=="dll")d++}; print $1,c,d}' OFS="\t" file
id      chfl    dll
--------------------
id12    2       1
id14    1       0
id13    0       0
id23    1       1

Bash one-liner with grep:

while IFS= read -r line ; do printf '%s\n' "$line" | grep -o 'chfl' | wc -l ; done < your_file

-o outputs every occurrence on a new line, and wc -l counts those lines.

Edit for multiple patterns:

patterns=(chfl dll)

while IFS= read -r line ; do
    for pattern in "${patterns[@]}" ; do
        printf '%s\t' "$pattern" ; printf '%s\n' "$line" | grep -o -- "$pattern" | wc -l
    done
done < your_file
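
The loops above print only the counts. A sketch that keeps the id next to each count, as the question asks for, might look like this. count_per_line is a hypothetical helper name, and the sketch assumes the id is everything before the first space on the line:

```shell
# count_per_line PATTERN < FILE
# Prints "<id> <count>" for every input line.
# (The helper name is made up for this sketch.)
count_per_line() {
  pattern=$1
  while IFS= read -r line; do
    id=${line%% *}   # text before the first space = id
    # grep -o prints each match on its own line; wc -l counts them.
    # grep exits non-zero when nothing matches, so "|| true" keeps that
    # from aborting the script under "set -e" / "pipefail".
    n=$(printf '%s\n' "$line" | { grep -o -- "$pattern" || true; } | wc -l)
    echo "$id $(($n))"   # $(($n)) drops the padding some wc versions add
  done
}

count_per_line chfl <<'EOF'
id12 ack dko hhhh chfl dkl dll chfl
id14 slo ksol chfl dloo
id13 mse
id23 clos chfl dll alo
EOF
```

Spawning grep and wc once per line is slow on large files; the awk and Perl answers do the same work in a single process.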

Another version of awk:

$ awk '{c1=gsub(var1,x);c2=gsub(var2,x);print $1,var1"="c1,var2"="c2}' var1="chfl" var2="dll"  file
id12 chfl=2 dll=1
id14 chfl=1 dll=0
id13 chfl=0 dll=0
id23 chfl=1 dll=1

Just pass the patterns you want to count as the var1 and var2 assignments on the command line, before the file name.
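
The same gsub idea can be extended to any number of patterns. A sketch, assuming the patterns are passed as one space-separated string; count_patterns and the awk variable pats are names chosen for this example, not part of the answer above:

```shell
# count_patterns 'PAT1 PAT2 ...' < FILE
# Prints "<id> PAT1=<n1> PAT2=<n2> ..." for every input line.
# (Function and variable names are hypothetical.)
count_patterns() {
  awk -v pats="$1" '
    BEGIN { n = split(pats, p, " ") }   # p[1..n] holds the patterns
    {
      out = $1                          # start the output with the id
      for (k = 1; k <= n; k++) {
        line = $0                       # work on a copy so $0 stays intact
        c = gsub(p[k], "", line)        # gsub returns the number of replacements
        out = out " " p[k] "=" c
      }
      print out
    }'
}

count_patterns 'chfl dll' <<'EOF'
id12 ack dko hhhh chfl dkl dll chfl
id14 slo ksol chfl dloo
id13 mse
id23 clos chfl dll alo
EOF
```

Each pattern is treated as a regular expression and matched anywhere in the line, including inside the id, so this only fits data where the ids cannot contain the patterns.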

You can use this awk:

awk '{d=c=0;for(i=1;i<=NF;i++){ if($i ~ /chfl/)c++; if($i ~ /dll/)d++;} print $1,c,d}' yourfile

A Perl one-liner:

perl -ne 'my $c=s/chfl//g||0;my $d=s/dll//g||0;s/ .*//s;print "$_ chfl $c dll $d\n"' file

Explanation:

  • s///g returns the number of substitutions made (an empty string if there were none)
  • ||0 makes sure the variable is set to zero if there are no matches
  • s/ .*//s throws away everything from the first space onward in $_, leaving only the id

It will produce the following output:

id12 chfl 2 dll 1
id14 chfl 1 dll 0
id13 chfl 0 dll 0
id23 chfl 1 dll 1
