Calculate proportion of specific letters in a file using AWK
First of all, thank you for helping with my problem. I have a CSV file with a column that contains different letter codes, as you can see here:
ABC
CD
EF
F
D
F
AS
I know how to calculate the proportion of F in the column and display the output as follows:
awk 'BEGIN{sum=0;count=0}{count++}{if($1=="F")sum++}END{print sum/count}'
0.286
But my problem is that I want to do this generally, for every line of the column that consists of a single character, so the output will be:
F 0.286
D 0.143
Thank you everyone for your help.
Instead of using if ($1=="F") to select one specific character, count every single-letter line in an array and iterate over that array at the end.
awk 'length($0)==1{a[$0]++} END{for(c in a) print c, a[c]/NR}'
Arrays in awk have no fixed iteration order, so the output might be either
F 0.286
D 0.143
or
D 0.143
F 0.286
If you want a fixed order, pipe the result through sort.
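As a minimal sketch of that suggestion, the commands below run the one-liner on the sample data from the question and sort the result. The file name letters.txt is only an assumption for illustration, and the second command (sorting by proportion, largest first) is an optional variation that the answer itself does not mention.

```shell
# Write the sample column from the question to a hypothetical file name.
printf '%s\n' ABC CD EF F D F AS > letters.txt

# Count single-character lines, then sort alphabetically by letter.
awk 'length($0)==1{a[$0]++} END{for(c in a) print c, a[c]/NR}' letters.txt | sort

# Variation: sort numerically on the second field, largest proportion first.
# LC_ALL=C is set so the decimal point is parsed the same way everywhere.
awk 'length($0)==1{a[$0]++} END{for(c in a) print c, a[c]/NR}' letters.txt | LC_ALL=C sort -k2,2nr
```

Either way the order no longer depends on how the awk implementation walks the array.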
With your shown samples, please try the following.
awk '
/^.$/{
  count[$0]++
}
END{
  for(key in count){
    print key,count[key]/FNR
  }
}
' Input_file
Explanation: a detailed, line-by-line explanation of the above.
awk '                          ## Start the awk program.
/^.$/{                         ## Match lines that are exactly one character long.
  count[$0]++                  ## Increment the counter indexed by the current line.
}
END{                           ## Start the END block, which runs after all input is read.
  for(key in count){           ## Loop over every key in the count array.
    print key,count[key]/FNR   ## Print the key and its count divided by FNR (the number of lines read).
  }
}
' Input_file                   ## Name of the input file.
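One detail worth noting: the question shows proportions rounded to three decimals (0.286), whereas print formats non-integer numbers with awk's default output format (OFMT, "%.6g"), so the programs above print 0.285714. The following is a sketch of one way to get the rounded form, using printf with "%.3f". It is a variation on the answer, not part of it. NR is used as the denominator here, which for a single input file is the same line count as FNR, and the final sort is only there to make the order predictable.

```shell
# Recreate the sample column from the question in Input_file.
printf '%s\n' ABC CD EF F D F AS > Input_file

awk '
/^.$/ { count[$0]++ }                              # count one-character lines
END {
  for (key in count)
    printf "%s %.3f\n", key, count[key] / NR       # round to three decimals
}
' Input_file | sort
```

With the sample data this prints D 0.143 and F 0.286, one per line.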