Using AWK to calculate the proportion of specific letters in a file

Question

First of all, thank you for helping with my problem. I have a CSV file with a column that contains different letter codes, as you can see here:

ABC
CD
EF
F
D
F
AS

I know how to calculate the proportion of F in the column and display the output as follows:

awk 'BEGIN{sum=0;count=0}{count++}{if($1=="F")sum++}END{printf "%.3f\n", sum/count}'
0.286

But my problem is that I want to do this in general, for every line of the column that contains a single character, so that the output would be:

F 0.286
D 0.143

Thank you everyone for your help.

Answer 1

Instead of using if ($1=="F") to select one specific character, store the counts of all single-letter lines in an array and iterate over it at the end.

awk 'length($0)==1{a[$0]++} END{for(c in a) print c, a[c]/NR}'

Arrays in awk have no fixed iteration order, so the output (shown here rounded to three decimal places) might be either

F 0.286
D 0.143

or

D 0.143
F 0.286

If you want a fixed order, pipe the result through sort.
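
As a minimal sketch of that suggestion, the same command can be piped into sort. The sample data is the column from the question, fed in through a shell variable named data (an assumption made for illustration). The printf "%.3f" inside awk is also an addition, not part of the command above, so that the values print rounded to three decimals:

```shell
# Sample column from the question, one value per line.
data='ABC
CD
EF
F
D
F
AS'

# Count single-character lines, divide each count by the total number
# of lines (NR), then sort so the order no longer depends on how awk
# happens to traverse its array.
result=$(printf '%s\n' "$data" |
  awk 'length($0)==1{a[$0]++} END{for(c in a) printf "%s %.3f\n", c, a[c]/NR}' |
  sort)

printf '%s\n' "$result"
```

With this input the sorted result lists D before F, regardless of which awk implementation is used.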

Answer 2

With the samples you have shown, please try the following.

awk '
/^.$/{
  count[$0]++
}
END{
  for(key in count){
    print key,count[key]/FNR
  }
}
' Input_file

Explanation: a detailed explanation of the above.

awk '                    ## Start of the awk program.
/^.$/{                   ## Match lines that are exactly one character long.
  count[$0]++            ## Increment the counter indexed by the current line.
}
END{                     ## END block, run after all input has been read.
  for(key in count){     ## Loop over every key in the count array.
    print key,count[key]/FNR  ## Print the key and its count divided by FNR (the number of lines read).
  }
}
' Input_file             ## Name of the input file.
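
Since the question mentions a CSV file, here is a hedged sketch of how the same idea might be applied to one column of a comma-separated file. The sample rows, the id column, and the col variable are all hypothetical, and the sketch assumes there is no header row and no quoted fields containing commas:

```shell
# Hypothetical CSV: an id in column 1, the letter code in column 2.
csv='1,ABC
2,CD
3,EF
4,F
5,D
6,F
7,AS'

# col is a hypothetical variable naming the column to inspect.
# -F, makes awk split each line on commas.
csv_result=$(printf '%s\n' "$csv" |
  awk -F, -v col=2 '
    length($col)==1 { count[$col]++ }   # count single-character values
    END { for (key in count) printf "%s %.3f\n", key, count[key]/NR }
  ' | sort)

printf '%s\n' "$csv_result"
```

Dividing by NR keeps the denominator as the total number of rows, matching both answers; a file with a header line would need that line excluded from the count.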

2 answers

Answer 1
2 Accepted 2021-04-24 16:32:17

Answer 2
2 2021-04-24 16:42:52
