
Error in writing output file through AWK scripting

I have an AWK script that writes values matching specific patterns to a .csv file. The code is as follows:

BEGIN{print "Query Start,Query End, Target Start, Target End,Score, E,P,GC"}
/^\>g/ { Query=$0 }
 /Query =/{
    split($0,a," ")
    query_start=a[3]
    query_end=a[5]
    query_end=gsub(/,/,"",query_end)
    target_start=a[8]
    target_end=a[10]
    }
    /Score =/{
    split($0,a," ")
    score=a[3]
    score=gsub(/,/,"",score)
    e=a[6]
    e=gsub(/,/,"",e)
    p=a[9]
    p=gsub(/,/,"",p)
    gc=a[12]

    printf("%s,%s,%s,%s,%s,%s,%s,%s\n",query_start, query_end,target_start,target_end,score,e,p,gc)
    }

The input file is as follows:

>gi|ABCDEF|

 Plus strand results:

 Query = 100 - 231, Target = 100 - 172
 Score = 20.92, E = 0.01984, P = 4.309e-08, GC =  51

But the output I got in the .csv file was this:

100 0   100 172 0   0   0   51

The program failed to copy the values of Query End, Score, E, and P. (Note: all of the failed values appear immediately before a comma in the input.)

Any help in obtaining the right output would be great.

Best regards,

Amit

As @Jidder mentioned, you don't need to call split(), and as @jaypal mentioned, you're using gsub() incorrectly. You also don't need to call gsub() at all if you just include , in your FS.

Try this:
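To illustrate the gsub() point: gsub() edits its target in place and returns the number of substitutions it made, so assigning its return value back to the variable overwrites the cleaned string with a count. A minimal sketch, where the variable names `wrong` and `value` are only illustrative:

```shell
# gsub() modifies its third argument in place and returns a count,
# not the modified string.
gsub_demo=$(echo "231," | awk '{
    wrong = $1
    value = $1
    wrong = gsub(/,/, "", wrong)   # wrong now holds the substitution count
    gsub(/,/, "", value)           # value now holds the cleaned string
    print wrong, value
}')
echo "$gsub_demo"
```

Here `wrong` ends up as the count of commas removed, while `value` keeps the field with the comma stripped.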

BEGIN {
    FS = "[[:space:],]+"
    OFS = ","
    print "Query Start","Query End","Target Start","Target End","Score","E","P","GC"
}
/^\>g/ { Query=$0 }
/Query =/ {
    query_start=$4
    query_end=$6
    target_start=$9
    target_end=$11
}
/Score =/ {
    score=$4
    e=$7
    p=$10
    gc=$13

    print query_start,query_end,target_start,target_end,score,e,p,gc
}

Does that work? Note that the field numbers are shifted up by 1: when you don't use the default FS, awk no longer skips leading white space, so the leading white space in your input acts as a separator with an empty field before it.
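The shift can be seen by splitting the same input line both ways. This is only a sketch; it assumes an awk that supports POSIX character classes such as `[[:space:]]` (gawk does), and the shell variable names are made up for the example:

```shell
line=' Query = 100 - 231, Target = 100 - 172'

# Default FS: leading blanks are skipped, "Query" is $1,
# and the comma stays attached to the fifth field.
default_fields=$(printf '%s\n' "$line" | awk '{ print NF, $1, $3, $5 }')

# Custom FS: the leading blank counts as a separator, so $1 is empty
# and every other field moves up by one; the comma is consumed by FS.
custom_fields=$(printf '%s\n' "$line" | awk -F '[[:space:],]+' '{ print NF, "[" $1 "]", $4, $6 }')

echo "$default_fields"
echo "$custom_fields"
```

The second command reports one more field than the first, and its `$1` prints as an empty pair of brackets.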

Also, you aren't using your Query variable, so the line that populates it is redundant.
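For reference, here is a sketch of the script with the unused Query rule dropped, run against the sample input from the question. The shell variables `input` and `result` are just scaffolding for the example, and the same assumption about `[[:space:]]` support applies:

```shell
input='>gi|ABCDEF|

 Plus strand results:

 Query = 100 - 231, Target = 100 - 172
 Score = 20.92, E = 0.01984, P = 4.309e-08, GC =  51'

result=$(printf '%s\n' "$input" | awk '
BEGIN {
    FS = "[[:space:],]+"    # split on runs of white space and/or commas
    OFS = ","
    print "Query Start","Query End","Target Start","Target End","Score","E","P","GC"
}
/Query =/ {
    # remember the coordinates until the matching Score line arrives
    query_start = $4; query_end = $6
    target_start = $9; target_end = $11
}
/Score =/ {
    print query_start, query_end, target_start, target_end, $4, $7, $10, $13
}')

echo "$result"
```

Redirecting the awk output to a file instead of capturing it in `result` would produce the .csv.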
