
Awk: retrieving lines that match a condition

I want to create a new file from an existing one, keeping only the lines in which a given column matches a pattern.

Part of the base file:

"1","rs543921925","ENSG00000187634","ENST00000616125","intron_variant"
"2","rs543921925","ENSG00000187634","ENST00000620200","intron_variant"
"3","rs543921925","ENSG00000187634","ENST00000617307","intron_variant"
"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"
"5","rs146327803","ENSG00000187634","ENST00000437963","missense_variant"
"6","rs146327803","ENSG00000187634","ENST00000342066","missense_variant"
"7","rs146327803","ENSG00000187634","ENST00000618181","missense_variant"

The file I want:

"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"
"5","rs146327803","ENSG00000187634","ENST00000437963","missense_variant"
"6","rs146327803","ENSG00000187634","ENST00000342066","missense_variant"
"7","rs146327803","ENSG00000187634","ENST00000618181","missense_variant"

I tried:

awk -F'"' '$9 ~ /missense_variant/ { print $0 }' base_file.txt

But it doesn't work.

I think it is sometimes better to use the file's actual delimiter.

$ awk -F, '$NF=="\"missense_variant\""' base_file.txt

is probably what you intended.
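As a rough end-to-end sketch of that approach, the snippet below writes a few of the question's sample rows to a scratch directory, filters them on the last comma-separated field, and saves the result to a new file. The scratch directory and the output name `missense_only.txt` are made up for illustration, and splitting on `,` assumes no field contains a comma of its own.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
tmp=$(mktemp -d)

# Three rows copied from the question's sample data.
cat > "$tmp/base_file.txt" <<'EOF'
"3","rs543921925","ENSG00000187634","ENST00000617307","intron_variant"
"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"
"5","rs146327803","ENSG00000187634","ENST00000437963","missense_variant"
EOF

# Split on commas. The last field keeps its surrounding quotes,
# so it is compared against the quoted string.
# missense_only.txt is a hypothetical output name.
awk -F, '$NF == "\"missense_variant\""' "$tmp/base_file.txt" > "$tmp/missense_only.txt"

cat "$tmp/missense_only.txt"
```

The exact comparison with `==` avoids matching longer values that merely contain `missense_variant`.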

You could easily have worked this out yourself by printing every field of the first line:

$ awk -F'"' 'NR==1{for (i=1; i<=NF; i++) print NF, i, "<" $i ">"}' file
11 1 <>
11 2 <1>
11 3 <,>
11 4 <rs543921925>
11 5 <,>
11 6 <ENSG00000187634>
11 7 <,>
11 8 <ENST00000616125>
11 9 <,>
11 10 <intron_variant>
11 11 <>

Note $9 versus $10: with `"` as the field separator, $9 is a comma and the variant type is in $10.
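A small check of that observation, using one row from the question: the same test is run against $10 and against $9. The variable names `line`, `fixed` and `wrong` are only for this illustration.

```shell
# One row from the question's sample data.
line='"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"'

# With " as the separator, odd-numbered fields hold the commas (or are empty)
# and even-numbered fields hold the values, so the fifth column is $10.
fixed=$(printf '%s\n' "$line" | awk -F'"' '$10 ~ /missense_variant/')

# The question's original test looks at $9, which is just ",".
wrong=$(printf '%s\n' "$line" | awk -F'"' '$9 ~ /missense_variant/')

printf 'with $10: %s\n' "$fixed"
printf 'with $9:  %s\n' "$wrong"
```

The `$10` version prints the row, while the `$9` version prints nothing, which is why the original command produced no output.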

Also, consider using this for FS:

$ awk -F'^"|","|"$' 'NR==1{for (i=1; i<=NF; i++) print NF, i, "<" $i ">"}' file
7 1 <>
7 2 <1>
7 3 <rs543921925>
7 4 <ENSG00000187634>
7 5 <ENST00000616125>
7 6 <intron_variant>
7 7 <>

Or:

$ awk -F'","' '{gsub(/^"|"$/,"")} NR==1{for (i=1; i<=NF; i++) print NF, i, "<" $i ">"}' file
5 1 <1>
5 2 <rs543921925>
5 3 <ENSG00000187634>
5 4 <ENST00000616125>
5 5 <intron_variant>
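Building on the last variant, here is a sketch of how the filter itself might look once the outer quotes are stripped and `","` is the separator. Saving the record in `rec` before calling `gsub` is an addition made here so the output keeps its original quoting; it is not part of the answer above.

```shell
# Two rows from the question's sample data.
rows='"3","rs543921925","ENSG00000187634","ENST00000617307","intron_variant"
"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"'

# Keep a copy of the untouched record, then strip the leading and trailing
# quote from $0. Modifying $0 makes awk re-split it on "," into five clean
# fields, so the fifth column can be compared without any quotes.
kept=$(printf '%s\n' "$rows" |
  awk -F'","' '{ rec = $0; gsub(/^"|"$/, "") } $5 == "missense_variant" { print rec }')

printf '%s\n' "$kept"
```

With this separator the field numbers line up with the column numbers, which makes the condition easier to read than the `$10` form.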

Indeed, an awk script can solve the problem, but grep is easier and simpler.

The error in your script is the field separator:

awk -F',' '$5 ~ /missense_variant/ { print }' base_file.txt

works fine.

But grep is simpler:

grep "missense_variant\"$" input.txt

awk '/missense_variant/{print $0}' file

"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"
"5","rs146327803","ENSG00000187634","ENST00000437963","missense_variant"
"6","rs146327803","ENSG00000187634","ENST00000342066","missense_variant"
"7","rs146327803","ENSG00000187634","ENST00000618181","missense_variant"
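One difference between the whole-line match and the anchored grep may be worth keeping in mind: an unanchored pattern matches the text anywhere in the line. The sketch below uses a made-up second row (id `8`, with `missense_variant_note` in column 2) that does not appear in the question's data, purely to show the effect.

```shell
# The first row is from the question; the second row is hypothetical and
# mentions the term in column 2 while its last column is intron_variant.
rows='"4","rs146327803","ENSG00000187634","ENST00000420190","missense_variant"
"8","missense_variant_note","ENSG00000187634","ENST00000420190","intron_variant"'

# Unanchored: counts every line containing the text anywhere.
anywhere=$(printf '%s\n' "$rows" | grep -c 'missense_variant')

# Anchored: counts only lines whose last quoted field is exactly missense_variant.
anchored=$(printf '%s\n' "$rows" | grep -c '"missense_variant"$')

printf 'unanchored: %s, anchored: %s\n' "$anywhere" "$anchored"
```

On the sample data from the question both forms select the same four lines, so the distinction only matters if the term can occur in other columns.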

Thank you for all your suggestions. They work very well, and I'll see which one best fits my problem.
