
Search for a number from a mother file in a daughter file and append the complete matching line from the daughter file to the mother file

I have a file File1.txt with data in the following form:

*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6         
 46573       1   48206   48210   48217   48205       0       0           
 46574       1   48205   48217   48218   48204       0       0
............................................................
.............................................................   

I want to take the number in the third column, such as 48206, and look it up in another file, File2.txt, which has the following format:

text
text
48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
48207       56.995224  -1.911598e-009      149.449997       0       0
text
48208       56.995224  -1.911598e-009      149.449997       0       0
...

I then want to append the complete matching line, including the number, to the end of the first file, so that File1.txt will look like:

*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6      
46573       1   48206   48210   48217   48205       0       0       
46574       1   48205   48217   48218   48204       0       0       
....................................................
............................................................
48206       55.995308  -1.879442e-009      149.449997       0       0

Any suggestions using sed or awk?

With awk:

awk 'NR == FNR { print; if(NR > 2) { seen[$3] = 1 }; next } seen[$1]' file1 file2

The code works as follows:

NR == FNR {       # while processing the first file
  print           # print the line (echoing file1 fully)
  if(NR > 2) {    # from the third line onward (skipping the two header lines)
    seen[$3] = 1  # remember the third fields you saw
  }
  next            # don't do anything else.
}
seen[$1]          # while processing the second file: select lines
                  # whose first field is one of the remembered fields.

You can then redirect the output to another file and replace file1 with that file afterwards:

awk 'NR == FNR { print; if(NR > 2) { seen[$3] = 1 }; next } seen[$1]' file1 file2 > file1.new && mv file1.new file1
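As a quick check of the command above, here is a minimal sketch that runs it on the sample lines from the question. The scratch directory and the `workdir` variable are assumptions made for illustration only.

```shell
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/file1" <<'EOF'
*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6
 46573       1   48206   48210   48217   48205       0       0
 46574       1   48205   48217   48218   48204       0       0
EOF
cat > "$workdir/file2" <<'EOF'
text
48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
48207       56.995224  -1.911598e-009      149.449997       0       0
EOF

# Echo file1, remember column 3 of its data lines, append the matching
# file2 lines, and replace file1 only if awk succeeded.
awk 'NR == FNR { print; if (NR > 2) { seen[$3] = 1 }; next } seen[$1]' \
    "$workdir/file1" "$workdir/file2" > "$workdir/file1.new" \
    && mv "$workdir/file1.new" "$workdir/file1"

cat "$workdir/file1"
```

Note that the appended lines come out in the order they occur in file2, not in the order the numbers appear in file1, so here the 48205 line is appended before the 48206 line.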

You can get the field from file1 with awk, then find the matching lines in file2 with grep, and simply concatenate the files afterwards.

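while read -r LINE
do
    SEARCHSTR=$(echo "$LINE" | awk '{print $3}')
    # skip lines without a third field; an empty pattern would match every line
    [ -n "$SEARCHSTR" ] || continue
    # -w matches the number only as a whole word, so 48206 does not match 148206
    grep -w -- "$SEARCHSTR" file2.txt >>append.txt
done < file1.txt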

cat file1.txt append.txt >file1_append.txt
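The loop above starts one awk and one grep process per line of file1.txt. A possible variation, sketched here under the assumption that the number always sits in the first column of file2.txt, is to build a pattern file once and call grep a single time. The scratch directory, `patterns.txt`, and the 148206 line are hypothetical additions used only to show the anchoring.

```shell
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6
 46573       1   48206   48210   48217   48205       0       0
 46574       1   48205   48217   48218   48204       0       0
EOF
cat > "$workdir/file2.txt" <<'EOF'
text
48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
48207       56.995224  -1.911598e-009      149.449997       0       0
148206      57.000000  -1.900000e-009      149.449997       0       0
EOF

# Turn each numeric third field into a pattern anchored at the start of the
# line, so 48206 matches neither 148206 nor a number in a later column.
awk '$3 ~ /^[0-9]+$/ { print "^[[:space:]]*" $3 "[[:space:]]" }' \
    "$workdir/file1.txt" > "$workdir/patterns.txt"

# A single grep call reads all patterns from the file.
grep -f "$workdir/patterns.txt" "$workdir/file2.txt" > "$workdir/append.txt"

cat "$workdir/file1.txt" "$workdir/append.txt" > "$workdir/file1_append.txt"
cat "$workdir/file1_append.txt"
```

Because the header lines have no numeric third field, they produce no pattern, which also avoids the empty-pattern case in which grep would match every line.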

To have grep search for the contents of several fields, a regex has to be constructed that contains them all, i.e.

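# BEGIN must be its own rule, outside the main action block
SEARCHSTR=$(echo "$LINE" | awk 'BEGIN {OFS="|"} {print $3, $4, $5}')
# -E enables extended regular expressions, where ( ) and | mean grouping and alternation
grep -E "($SEARCHSTR)" file2.txt >>append.txt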

Here $SEARCHSTR already contains the contents of the fields, separated by |.
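One way these two lines might fit back into the loop is sketched below, using the sample lines from the question. The scratch directory, the 48217 line in file2.txt, the start-of-line anchoring, and the final `sort -u` are assumptions added for illustration and are not part of the original answer.

```shell
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
*ELEMENT_SHELL
$#   eid     pid      n1      n2      n3      n4      n5      n6
 46573       1   48206   48210   48217   48205       0       0
 46574       1   48205   48217   48218   48204       0       0
EOF
cat > "$workdir/file2.txt" <<'EOF'
text
48205       54.995392  -1.847287e-009      149.449997       0       0
48206       55.995308  -1.879442e-009      149.449997       0       0
48207       56.995224  -1.911598e-009      149.449997       0       0
48217       58.000000  -1.950000e-009      149.449997       0       0
EOF
: > "$workdir/append.txt"   # start with an empty result file

while read -r LINE
do
    # Join fields 3 to 5 with "|", but only for lines whose third field is numeric.
    SEARCHSTR=$(printf '%s\n' "$LINE" |
        awk 'BEGIN {OFS="|"} NF >= 5 && $3 ~ /^[0-9]+$/ {print $3, $4, $5}')
    [ -n "$SEARCHSTR" ] || continue
    # Match any of the numbers, but only in the first column of file2.txt.
    grep -E "^[[:space:]]*($SEARCHSTR)[[:space:]]" "$workdir/file2.txt" \
        >> "$workdir/append.txt"
done < "$workdir/file1.txt"

# A number shared by several file1 lines (48217 here) is found once per line,
# so drop the repeated matches before using the result.
sort -u "$workdir/append.txt" > "$workdir/append_unique.txt"
cat "$workdir/append_unique.txt"
```

Without the deduplication step, every number that occurs in more than one line of file1.txt would be appended more than once.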

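Concerning speed: if the files have their columns at fixed positions, you could use cut instead of awk, like this. Note that $LINE must be quoted here, and the line read with IFS= read -r, so that the original spacing (and therefore the character positions) survives:

SEARCHSTR=$(echo "$LINE" | cut --output-delimiter="|" -c 15-20,21-26,27-32 | tr -d " ")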
