AWK array from split not sorted
I have this (demo) text in the variable ArtTEXT.
{1}: Reporting Problems and Bugs.
{2}: Other freely available awk implementations.
{3}: Summary of installation.
{4}: How to disable certain gawk extensions.
{5}: Making Additions To gawk.
{6}: Accessing the Git repository.
{7}: Adding code to the main body of gawk.
{8}: Porting gawk to a new operating system.
{9}: Why derived files are kept in the Git repository.
It is a single variable in which the lines are separated by an indent.
indent = "\n\t\t\t";
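I want to loop through the lines and check the content of each one.
So I split the variable into an array, using the indent as the separator: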
split(ArtTEXT,lin, indent);
Then I loop through the array lin:
l = 0;
for (l in lin) {
print "l -- ", l, " lin[l] -- " ,lin[l] ;
}
What I get is the contents of ArtTEXT starting at element 4, with elements 1 to 3 printed last:
l -- 4 lin[l] -- {3}: Summary of installation.
l -- 5 lin[l] -- {4}: How to disable certain gawk extensions.
l -- 6 lin[l] -- {5}: Making Additions To gawk.
l -- 7 lin[l] -- {6}: Accessing the Git repository.
l -- 8 lin[l] -- {7}: Adding code to the main body of gawk.
l -- 9 lin[l] -- {8}: Porting gawk to a new operating system.
l -- 10 lin[l] -- {9}: Why derived files are kept in the Git repository.
l -- 1 lin[l] --
l -- 2 lin[l] -- {1}: Reporting Problems and Bugs.
l -- 3 lin[l] -- {2}: Other freely available awk implementations.
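(The original text has an empty line at the beginning.)
The manual says this about the split function:
The first piece is stored in array[1], the second piece in array[2], and so forth.
How can I avoid this problem?
Why does this happen?
Thanks.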
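In awk, arrays are unordered: split does store the pieces under the indices 1, 2, 3, ..., but the order in which for (l in lin) visits those indices is unspecified. If they happen to come out in order, that is by accident.
In GNU awk, the traversal order can be controlled. For example, to visit the indices in ascending numeric order, use PROCINFO["sorted_in"]="@ind_num_asc":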
$ awk -v ArtTEXT="$(cat file)" 'BEGIN{PROCINFO["sorted_in"]="@ind_num_asc"; indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l in lin) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l -- 1 lin[l] -- {1}: Reporting Problems and Bugs.
l -- 2 lin[l] -- {2}: Other freely available awk implementations.
l -- 3 lin[l] -- {3}: Summary of installation.
l -- 4 lin[l] -- {4}: How to disable certain gawk extensions.
l -- 5 lin[l] -- {5}: Making Additions To gawk.
l -- 6 lin[l] -- {6}: Accessing the Git repository.
l -- 7 lin[l] -- {7}: Adding code to the main body of gawk.
l -- 8 lin[l] -- {8}: Porting gawk to a new operating system.
l -- 9 lin[l] -- {9}: Why derived files are kept in the Git repository.
Alternatively, since the array indices are numeric, we can loop over them numerically, using for (l=1;l<=length(lin);l++) print...:
$ awk -v ArtTEXT="$(cat file)" 'BEGIN{indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l=1;l<=length(lin);l++) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l -- 1 lin[l] -- {1}: Reporting Problems and Bugs.
l -- 2 lin[l] -- {2}: Other freely available awk implementations.
l -- 3 lin[l] -- {3}: Summary of installation.
l -- 4 lin[l] -- {4}: How to disable certain gawk extensions.
l -- 5 lin[l] -- {5}: Making Additions To gawk.
l -- 6 lin[l] -- {6}: Accessing the Git repository.
l -- 7 lin[l] -- {7}: Adding code to the main body of gawk.
l -- 8 lin[l] -- {8}: Porting gawk to a new operating system.
l -- 9 lin[l] -- {9}: Why derived files are kept in the Git repository.
Spread over multiple lines, the GNU awk code looks like this:
awk -v ArtTEXT="$(cat file)" '
BEGIN{
PROCINFO["sorted_in"]="@ind_num_asc"
indent="\n\t\t\t"
split(ArtTEXT, lin, indent)
for (l in lin)
print "l -- ", l, " lin[l] -- " ,lin[l]
}'
And the alternative code is:
awk -v ArtTEXT="$(cat file)" '
BEGIN{
indent="\n\t\t\t"
split(ArtTEXT, lin, indent)
for (l=1;l<=length(lin);l++)
print "l -- ", l, " lin[l] -- " ,lin[l]
}'
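A variant of the numeric loop, offered only as a sketch: split returns the number of pieces it stored, so that return value can serve as the loop bound instead of length(lin), which may not accept an array argument in every awk implementation. The sample text built inside BEGIN is made-up stand-in data, not the original ArtTEXT, and the variable names n and out are introduced here for illustration.

```shell
out=$(awk '
BEGIN {
    indent = "\n\t\t\t"
    # Stand-in demo data (an assumption): three entries joined by the indent.
    ArtTEXT = "{1}: first" indent "{2}: second" indent "{3}: third"
    # split() returns how many pieces it stored in lin[1] .. lin[n].
    n = split(ArtTEXT, lin, indent)
    # Counting from 1 to n visits the pieces in the order split stored them.
    for (l = 1; l <= n; l++)
        print "l -- ", l, " lin[l] -- ", lin[l]
}')

printf '%s\n' "$out"
```

Because the loop counts from 1 to n rather than using for (l in lin), the output order does not depend on how the awk implementation stores the array internally.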