How to read non-ASCII characters in a file as ASCII using AWK
My input file:
000000000 vélIstine IOBAN 00000004960
000000000 shankargu kumar 00000000040
TTTTTTTTT 0000000200000000050000000000000000000000
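Whenever the file above contains non-ASCII characters, my code snippet below does not compute the sum correctly (d_amt_sum+=substr($0,27,10)). Sometimes it skips the line, and sometimes it gives an incorrect value: substr($0,27,10) returns 49 instead of 496.
In addition, I would like to know how to add a print statement in AWK. For example, how do I print the value of substr($0,27,10) inside the if block?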
set -A out_result -- `LC_ALL=en_US.UTF-8 awk 'BEGIN{
d_amt_sum=d_rec_count=d_trailer_out_amt_sum=d_trailer_rec_count=0;
}
{
if(substr($0,1,9) != "TTTTTTTTT")
{
d_amt_sum+=substr($0,27,10); d_rec_count+=1
}
else if(substr($0,1,9) == "TTTTTTTTT")
{
d_trailer_out_amt_sum+=substr($0,39,12);
d_trailer_rec_count+=substr($0,31,8);
}
}
END{print d_amt_sum, d_rec_count,d_trailer_out_amt_sum,d_trailer_rec_count}' ${OUTDIR}/${OUT_FILE}`
Expected output
500,2,500,2
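One plausible explanation for getting 49 instead of 496 is that awk is counting bytes rather than characters. This is an assumption, since the awk implementation and the installed locales are not stated. In UTF-8, "é" takes two bytes, so every byte offset after it shifts by one. The sketch below forces byte counting with LC_ALL=C to reproduce the effect on the first detail line, then shows one possible workaround that assumes the amount is always the last whitespace-separated field.

```shell
#!/bin/sh
# First detail line from the question; \303\251 is the two-byte UTF-8 encoding of "é".
line=$(printf '000000000 v\303\251lIstine IOBAN 00000004960')

# In the C locale awk counts bytes, so "é" occupies two positions and the
# amount starts one position later than column 27.
byte_view=$(printf '%s\n' "$line" | LC_ALL=C awk '{ print length($0), substr($0, 27, 10) + 0 }')
echo "$byte_view"     # 38 49

# Possible workaround (assumes the amount is always the last field):
# apply the offset inside that all-ASCII field instead of the whole line.
field_view=$(printf '%s\n' "$line" | LC_ALL=C awk '{ print substr($NF, 1, 10) + 0 }')
echo "$field_view"    # 496
```

The line is 37 characters long but 38 bytes long, which is why the byte-based substr picks up a leading space and drops the final digit. If a UTF-8 locale is set and the result is still 49, the awk in use may not be multibyte-aware, or the named locale may not be installed on the machine.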
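You have a logic error in the ordering of your if/else statements, and another error in comparing a 1-character string against a 9-character one. Fixing both issues...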
awk '{
    k = substr($0, 1, 9)                  # record key: first 9 characters
    if (k == "TTTTTTTTT") {               # trailer record
        d_trailer_out_amt_sum += substr($0, 39, 12)
        d_trailer_rec_count += substr($0, 31, 8)
    } else if (k != "999999999") {        # any other record except key 999999999
        d_amt_sum += substr($0, 27, 10)
        d_rec_count += 1
    }
}
END { print d_amt_sum, d_rec_count, d_trailer_out_amt_sum, d_trailer_rec_count }' file
500 2 500 2
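As for the second part of the question (printing a value inside the if block), print and printf can be used anywhere in an awk action. A minimal sketch follows; the temporary file and the DEBUG message format are made up for illustration, and only the two detail lines from the question are used. Sending the debug line to stderr keeps it out of the result that the script captures from stdout.

```shell
#!/bin/sh
# Hypothetical sample file built from the two detail lines in the question.
tmp=$(mktemp)
printf '000000000 v\303\251lIstine IOBAN 00000004960\n000000000 shankargu kumar 00000000040\n' > "$tmp"

# Only stdout is captured; the debug lines pass through on stderr.
result=$(awk '{
    k = substr($0, 1, 9)
    if (k != "TTTTTTTTT") {
        amt = substr($0, 27, 10)
        # Brackets make stray spaces in the extracted value visible.
        printf "DEBUG line %d: amt=[%s]\n", NR, amt | "cat 1>&2"
        d_amt_sum += amt
        d_rec_count += 1
    }
}
END { print d_amt_sum, d_rec_count }' "$tmp")

echo "$result"
rm -f "$tmp"
```

Piping to "cat 1>&2" is the portable way to reach stderr; with gawk, redirecting to "/dev/stderr" should work as well. If the debug line for the first record shows a leading space inside the brackets, awk is most likely working with byte offsets on that line.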