Segregation of numerical results from a single text file into multiple files in Linux
I have data like this:
#start
#gatherData
*ELEMENT_SHELL
48709 1 50614 50616 50618 50613
48710 1 50613 50618 50608 50609
48711 1 50616 50617 50619 50618
48712 1 50618 50619 50607 50608
48715 1 50589 50590 50620 50615
48716 1 50615 50620 50616 50614
48717 1 50590 50591 50621 50620
48721 1 50623 50625 50626 50622
48722 1 50622 50626 50610 50611
48723 1 50625 50614 50613 50626
*END
$PresentData
$RESULT OF strength
48709 1.0267261e-002
48710 1.0721873e-002
48711 1.1930415e-002
48712 1.2186395e-002
48715 9.7443219e-003
48716 1.0036242e-002
48717 1.1186538e-002
48721 7.9333931e-003
48722 8.6850608e-003
48723 8.9872172e-003
What I want to do is first check the results under $RESULT OF strength and see which numbers in the second column lie between 0 and 1e-002, then, based on that, search for the matching number between *ELEMENT_SHELL and *END and send the complete line to a new text file, test1.txt. If the number is between 1e-002 and 1e-003, the line should go to the next text file, test2.txt, so that this single file is segregated into two different files.
Text1.txt would have:
48709 1 50614 50616 50618 50613
48710 1 50613 50618 50608 50609
48711 1 50616 50617 50619 50618
48712 1 50618 50619 50607 50608
48716 1 50615 50620 50616 50614
48717 1 50590 50591 50621 50620
Text2.txt would have:
48721 1 50623 50625 50626 50622
48722 1 50622 50626 50610 50611
48723 1 50625 50614 50613 50626
48715 1 50589 50590 50620 50615
Can any expert suggest a way to do this with sed or awk? I think the final results could be piped easily, but segregating entries from the same file and then looking them up again is the problematic part.
Thanks in advance.
As a basic solution, consider the following code:
[hamadhassan $] cat tri.awk
#!/usr/bin/gawk -f
BEGIN{
    load_state=1;   # 1 while reading element lines, 0 once the results start
}
$0=="$RESULT OF strength"{
    # print "end of load state"
    load_state=0;
}
load_state==1 && NF==6{
    # print "storing "$0
    lut[$1]=$0; # store line in lookup table, keyed by element ID
}
load_state==0 && NF==2{
    if($2>0.0 && $2<1e-2){
        # result strictly between 0 and 1e-2
        if($1 in lut){
            print lut[$1] > "Text2.txt";
        }
    }else{
        # everything else (here: results of 1e-2 and above)
        if($1 in lut){
            print lut[$1] > "Text1.txt";
        }
    }
}
[hamadhassan $]
which, given your sample input:
[hamadhassan $] cat test.in
#start
#gatherData
*ELEMENT_SHELL
48709 1 50614 50616 50618 50613
48710 1 50613 50618 50608 50609
48711 1 50616 50617 50619 50618
48712 1 50618 50619 50607 50608
48715 1 50589 50590 50620 50615
48716 1 50615 50620 50616 50614
48717 1 50590 50591 50621 50620
48721 1 50623 50625 50626 50622
48722 1 50622 50626 50610 50611
48723 1 50625 50614 50613 50626
*END
$PresentData
$RESULT OF strength
48709 1.0267261e-002
48710 1.0721873e-002
48711 1.1930415e-002
48712 1.2186395e-002
48715 9.7443219e-003
48716 1.0036242e-002
48717 1.1186538e-002
48721 7.9333931e-003
48722 8.6850608e-003
48723 8.9872172e-003
[hamadhassan $]
gives:
[hamadhassan $] ./tri.awk test.in
[hamadhassan $] cat Text2.txt
48715 1 50589 50590 50620 50615
48721 1 50623 50625 50626 50622
48722 1 50622 50626 50610 50611
48723 1 50625 50614 50613 50626
[hamadhassan $] cat Text1.txt
48709 1 50614 50616 50618 50613
48710 1 50613 50618 50608 50609
48711 1 50616 50617 50619 50618
48712 1 50618 50619 50607 50608
48716 1 50615 50620 50616 50614
48717 1 50590 50591 50621 50620
[hamadhassan $]
This was on CentOS 6 with gawk 3.1.7.
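The same lookup-table idea can also be written so that it tracks the *ELEMENT_SHELL / *END / $RESULT markers explicitly instead of relying on field counts. This is only a minimal sketch under some assumptions: the file names sample.in, high.txt and low.txt, the awk variable names, and the shortened sample (a subset of the data in the question) are chosen for illustration and are not from the original post.

```shell
#!/bin/sh
# Shortened copy of the question's data; sample.in is a hypothetical name.
cat > sample.in <<'EOF'
#start
*ELEMENT_SHELL
48709 1 50614 50616 50618 50613
48715 1 50589 50590 50620 50615
48721 1 50623 50625 50626 50622
*END
$PresentData
$RESULT OF strength
48709 1.0267261e-002
48715 9.7443219e-003
48721 7.9333931e-003
EOF

# limit, high and low are illustrative variable names passed in with -v.
awk -v limit=0.01 -v high=high.txt -v low=low.txt '
    /^\*ELEMENT_SHELL/ { in_elems = 1; next }     # element block starts
    /^\*END/           { in_elems = 0; next }     # element block ends
    /^\$RESULT/        { in_results = 1; next }   # result block starts
    in_elems           { elem[$1] = $0; next }    # remember element line by ID
    in_results && ($1 in elem) {
        # "$2 + 0" forces a numeric comparison
        if ($2 + 0 < limit) print elem[$1] > low
        else                print elem[$1] > high
    }
' sample.in
```

With this sample, element 48709 ends up in high.txt and elements 48715 and 48721 in low.txt. Passing the threshold with -v means the split point can be changed without editing the script.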
You can try the following commands (assuming that the source file is txt.txt):
grep -F -A 1000 '$RESULT OF strength' txt.txt | awk '$2>0.01' | cut -f 1 | xargs -I{} grep {} txt.txt | grep -E "[0-9]+[[:blank:]]+1[[:blank:]]+" > test1.txt
grep -F -A 1000 '$RESULT OF strength' txt.txt | awk '$2<0.01' | cut -f 1 | xargs -I{} grep {} txt.txt | grep -E "[0-9]+[[:blank:]]+1[[:blank:]]+" > test2.txt
If the columns are separated by spaces, then it would be:
grep -F -A 1000 '$RESULT OF strength' txt.txt | sed 's/[[:blank:]][[:blank:]]*/ /g' | awk '$2>0.01' | cut -f 1 -d' ' | xargs -I{} grep {} txt.txt | grep -E "[0-9]+[[:blank:]]+1[[:blank:]]+" > test1.txt
grep -F -A 1000 '$RESULT OF strength' txt.txt | sed 's/[[:blank:]][[:blank:]]*/ /g' | awk '$2<0.01' | cut -f 1 -d' ' | xargs -I{} grep {} txt.txt | grep -E "[0-9]+[[:blank:]]+1[[:blank:]]+" > test2.txt
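One caveat with looking up each ID via grep is that the ID is matched anywhere in the line, so an element whose node columns happen to contain that number would be picked up as well. A possible way around this, sketched here under assumptions, is to collect the IDs first and then compare them only against the first field of the six-column lines. The file names demo.txt, ids_low.txt and low_elements.txt are hypothetical, and the third element and its result value are made-up data added just to show the collision.

```shell
#!/bin/sh
# demo.txt is a hypothetical file; the 50614 element and its result are
# made-up data whose node column repeats the ID 48715.
cat > demo.txt <<'EOF'
*ELEMENT_SHELL
48709 1 50614 50616 50618 50613
48715 1 50589 50590 50620 50615
50614 1 48715 50590 50620 50615
*END
$RESULT OF strength
48709 1.0267261e-002
48715 9.7443219e-003
50614 1.5000000e-002
EOF

# Step 1: collect the IDs listed after the marker whose value is below 0.01.
awk 'in_results && $2 + 0 < 0.01 { print $1 }
     /^\$RESULT OF strength/     { in_results = 1 }' demo.txt > ids_low.txt

# Step 2: read the IDs first, then print only six-field lines whose
# first field is one of them (exact match, not a substring match).
awk 'FILENAME == ARGV[1] { want[$1]; next }
     NF == 6 && ($1 in want)' ids_low.txt demo.txt > low_elements.txt
```

Here only the line for element 48715 lands in low_elements.txt. The line for element 50614 is skipped even though 48715 appears among its node numbers. Swapping the comparison in step 1 gives the other file in the same way.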