How do I use awk to count the lines for specific entries under a specific pattern?

Question

I have a text file laid out as follows:

Sample1
Feature 1
A
B
C
Feature 2
A
G
H
L
Sample2
Feature 1
A
M
W
Feature 2
P
L

I am trying to count the number of entries under each feature in each sample, so my desired output looks like this:

Sample1
Feature 1: 3
Feature 2: 4

Sample2
Feature 1: 3
Feature 2: 2

I tried the following awk command:

$ awk '{if(/^\Feature/){n=$0;}else{l[n]++}}
       END{for(n in l){print n" : "l[n]}}' inputfile.txt > result.txt

But it gave me the following output:

Feature 1: 6
Feature 2: 6

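Could anyone help me modify this command to get the desired output, or suggest a different command? (PS: the original file contains hundreds of samples and about 94 features.)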

Answer 1

You can use the following awk:

awk '/^Sample/{printf "%s%s",(c?c"\n":""),$0;c=0;next}
     /^Feature/{printf "%s\n%s: ",(c?c:""),$0;c=0;next}
     {c++}
     END{print c}' file

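The script increments the counter c only for lines that do not start with Sample or Feature.

When a line starting with either keyword is found, the counter accumulated so far is printed and reset; the END block prints the count for the last feature.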
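
The logic described above can be annotated and run against the sample data from the question. This is a minimal sketch, assuming a temporary file created with mktemp stands in for the real input; the variable names input and result are illustrative and not part of the answer.

```shell
# Recreate the sample input from the question in a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
Sample1
Feature 1
A
B
C
Feature 2
A
G
H
L
Sample2
Feature 1
A
M
W
Feature 2
P
L
EOF

result=$(awk '
  # Sample header: flush the pending count (if any), then print the header.
  /^Sample/  { printf "%s%s", (c ? c "\n" : ""), $0; c = 0; next }
  # Feature header: flush the pending count, then start a new "Feature N: " line.
  /^Feature/ { printf "%s\n%s: ", (c ? c : ""), $0; c = 0; next }
  # Any other line is an entry of the current feature.
  { c++ }
  # The last feature has no following header, so flush it here.
  END { print c }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because a count is printed only when it is non-zero, a feature with no entries would show up as a bare "Feature N: " line with this approach.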

Answer 2

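The following awk may help you. Note that it tracks exactly two features (Feature 1 and Feature 2) with dedicated counters, so it would need to be extended for a file with more features.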

awk '
/^Sample/ && count1 && count2{
   print "Feature 1:",count1 ORS "Feature 2:",count2;
   count1=count2=flag1=flag2=""}
/^Sample/{
   print;
   flag=1;
   next}
flag && /^Feature/{
   if($NF==1){ flag1=1 };
   if($NF==2){ flag2=1;
               flag1=""};
   next}
flag && flag1{ count1++ }
flag && flag2{ count2++ }
END{
   if(count1 && count2){
      print "Feature 1:",count1 ORS "Feature 2:",count2}
}'  Input_file

The output is as follows.

Sample1
Feature 1: 3
Feature 2: 4
Sample2
Feature 1: 3
Feature 2: 2

Answer 3

This awk also works:

awk '/^Sample/ {
   for (i in a)
      print i ": " a[i]
   print
   delete a
   next
}
/^Feature/ {
   f = $0
   next
}
{
   ++a[f]
}
END {
   for (i in a) 
      print i ": " a[i]
}' file

Sample1
Feature 1: 3
Feature 2: 4
Sample2
Feature 1: 3
Feature 2: 2
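
One caveat with the array approach above: awk does not specify the order in which for (i in a) visits keys, so with many features the lines may not come out in input order. A minimal sketch that records the order of first appearance in a second array is shown here; the names count_features, order, cnt, n, and flush are illustrative and not from the answer.

```shell
# Count entries per feature within each sample, printing features in input order.
count_features() {
  awk '
    # Print the counts for the current sample, then reset per-sample state.
    function flush(   i) {
      for (i = 1; i <= n; i++)
        print order[i] ": " cnt[order[i]]
      # split with an empty string is a portable way to empty an array.
      split("", cnt)
      split("", order)
      n = 0
    }
    /^Sample/ { flush(); print; next }
    /^Feature/ {
      f = $0
      # Remember each feature the first time it is seen in this sample.
      if (!(f in cnt)) { order[++n] = f; cnt[f] = 0 }
      next
    }
    # Any other line is an entry of the current feature.
    { cnt[f]++ }
    END { flush() }
  ' "$@"
}
```

It can be called as count_features file, or fed from a pipe. Unlike the answer above, this sketch prints a count of 0 for a feature that has no entries.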

Answer 4

$ cat tst.awk
BEGIN { OFS = ": " }
/Sample/  { prtFeat(); print (NR>1 ? ORS : "") $0; next }
/Feature/ { prtFeat(); name=$0; next }
{ ++cnt }
END { prtFeat() }
function prtFeat() {
    if (cnt) {
        print name, cnt
        cnt = 0
    }
}

$ awk -f tst.awk file
Sample1
Feature 1: 3
Feature 2: 4

Sample2
Feature 1: 3
Feature 2: 2

4 answers

Answer 1
1 Accepted 2018-06-04 08:30:26

Answer 2
0 2018-06-04 08:20:54

Answer 3
0 2018-06-04 08:26:29

Answer 4
0 2018-06-04 14:00:32
