
AWK array from split not sorted

I have this (demo) text in the variable ArtTEXT.

{1}: Reporting Problems and Bugs. 
{2}: Other freely available awk implementations. 
{3}: Summary of installation. 
{4}: How to disable certain gawk extensions. 
{5}: Making Additions To gawk. 
{6}: Accessing the Git repository. 
{7}: Adding code to the main body of gawk. 
{8}: Porting gawk to a new operating system.  
{9}: Why derived files are kept in the Git repository. 

It is a single variable in which the lines are separated by an indent.

indent = "\n\t\t\t";

I want to loop over the lines and check something in each line.

So I split it into an array, using the indent as the separator:

split(ArtTEXT,lin, indent);

Then I loop over the array lin:

l = 0;
for (l in lin) {
    print "l -- ", l, " lin[l] -- " ,lin[l] ;
}

What I get is the lines of ArtTEXT starting from line 4:

l --  4  lin[l] --  {3}: Summary of installation. 
l --  5  lin[l] --  {4}: How to disable certain gawk extensions. 
l --  6  lin[l] --  {5}: Making Additions To gawk. 
l --  7  lin[l] --  {6}: Accessing the Git repository. 
l --  8  lin[l] --  {7}: Adding code to the main body of gawk. 
l --  9  lin[l] --  {8}: Porting gawk to a new operating system.  
l --  10  lin[l] --  {9}: Why derived files are kept in the Git repository. 
l --  1  lin[l] --   
l --  2  lin[l] --  {1}: Reporting Problems and Bugs. 
l --  3  lin[l] --  {2}: Other freely available awk implementations. 

(The original text has an empty line at the beginning.)

The manual says this about the split function:

The first piece is stored in array[1], the second piece in array[2], and so forth.

How can I avoid this?

Why does this happen?

Thanks.

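In awk, arrays are unordered: split does store the pieces under the indices 1, 2, 3, and so on, but a for (l in lin) loop visits the indices in an unspecified order. If they happen to come out in order, that is by chance.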

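In GNU awk, the traversal order can be controlled. For example, to visit the elements in ascending numeric order of their indices, use PROCINFO["sorted_in"]="@ind_num_asc":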

$ awk -v ArtTEXT="$(cat file)" 'BEGIN{PROCINFO["sorted_in"]="@ind_num_asc"; indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l in lin) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l --  1  lin[l] --  {1}: Reporting Problems and Bugs. 
l --  2  lin[l] --  {2}: Other freely available awk implementations. 
l --  3  lin[l] --  {3}: Summary of installation. 
l --  4  lin[l] --  {4}: How to disable certain gawk extensions. 
l --  5  lin[l] --  {5}: Making Additions To gawk. 
l --  6  lin[l] --  {6}: Accessing the Git repository. 
l --  7  lin[l] --  {7}: Adding code to the main body of gawk. 
l --  8  lin[l] --  {8}: Porting gawk to a new operating system.  
l --  9  lin[l] --  {9}: Why derived files are kept in the Git repository. 

Alternatively, since the array indices are numeric, we can loop over them numerically, using for (l=1;l<=length(lin);l++) print...

$ awk -v ArtTEXT="$(cat file)" 'BEGIN{indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l=1;l<=length(lin);l++) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l --  1  lin[l] --  {1}: Reporting Problems and Bugs. 
l --  2  lin[l] --  {2}: Other freely available awk implementations. 
l --  3  lin[l] --  {3}: Summary of installation. 
l --  4  lin[l] --  {4}: How to disable certain gawk extensions. 
l --  5  lin[l] --  {5}: Making Additions To gawk. 
l --  6  lin[l] --  {6}: Accessing the Git repository. 
l --  7  lin[l] --  {7}: Adding code to the main body of gawk. 
l --  8  lin[l] --  {8}: Porting gawk to a new operating system.  
l --  9  lin[l] --  {9}: Why derived files are kept in the Git repository. 
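
A variation on the numeric loop, as a minimal sketch: split returns the number of pieces it created, so the count can be saved instead of calling length(lin), which not every awk implementation accepts for an array argument. The function name print_pieces and the demo text are hypothetical; the text is passed with \n and \t escapes, which awk expands in a -v assignment, so no input file is needed.

```shell
# print_pieces TEXT: split TEXT on the separator and print the pieces in index order.
# (print_pieces is a hypothetical helper name used only for this sketch.)
print_pieces() {
    awk -v ArtTEXT="$1" 'BEGIN {
        indent = "\n\t\t\t"
        n = split(ArtTEXT, lin, indent)   # n is the number of pieces stored in lin
        for (l = 1; l <= n; l++)          # walk the indices 1..n in order
            print l ": " lin[l]
    }'
}

# awk expands the \n and \t escapes in the -v assignment (hypothetical demo text).
print_pieces 'first\n\t\t\tsecond\n\t\t\tthird'
```

Because the loop counts from 1 to n itself, the output order does not depend on how the awk implementation stores the array.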

Multi-line version

Spread over multiple lines, the GNU awk code looks like this:

awk -v ArtTEXT="$(cat file)" '
BEGIN{
    PROCINFO["sorted_in"]="@ind_num_asc"
    indent="\n\t\t\t"
    split(ArtTEXT, lin, indent)
    for (l in lin)
        print "l -- ", l, " lin[l] -- " ,lin[l]
}'

And the alternative code is:

awk -v ArtTEXT="$(cat file)" '
BEGIN{
    indent="\n\t\t\t"
    split(ArtTEXT, lin, indent)
    for (l=1;l<=length(lin);l++)
        print "l -- ", l, " lin[l] -- " ,lin[l]
}'
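
The question mentions that the original text starts with an empty line and that each line should be checked for some content. The following is only a sketch of how the ordered loop might handle both, assuming the check is a regular-expression match; the function name list_matching, the variable pat, and the demo text are hypothetical and not part of the original post.

```shell
# list_matching TEXT PATTERN: print, in index order, the pieces of TEXT that match PATTERN.
# (list_matching and pat are hypothetical names used only for this sketch.)
list_matching() {
    awk -v ArtTEXT="$1" -v pat="$2" 'BEGIN {
        indent = "\n\t\t\t"
        n = split(ArtTEXT, lin, indent)
        for (l = 1; l <= n; l++) {
            if (lin[l] == "") continue             # skip empty pieces, e.g. before a leading separator
            if (lin[l] ~ pat) print l ": " lin[l]  # the per-line check
        }
    }'
}

# The demo text starts with the separator, so lin[1] is empty and is skipped.
list_matching '\n\t\t\t{1}: Reporting Problems and Bugs.\n\t\t\t{2}: Making Additions To gawk.\n\t\t\t{3}: Porting gawk to a new operating system.' 'gawk'
```

The printed index is the position in lin, so it is one higher than the number in braces whenever the text begins with the separator.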

