
Join two txt files

Using Linux, I have two *.txt files and want to keep only the lines of file1 whose first column appears in file2.

file1

HOUSAM000001189870 1012212222011212100020102102112011002200111112012220110 ....... 
HOUSAM000001213135 2012012000120120102010201002102111200122201100222201102 .......
HOUSAM000001237057 0012011222010210120120122210000101112000222210102010201 .......
HOUSAM000001239242 2210120111010010100100022001111000010220010102010201022 .......

file2

HOUSAM000001189870     
HOUSAM000001237057

Output file

HOUSAM000001189870 1012212222011212100020102102112011002200111112012220110 .......
HOUSAM000001237057 0012011222010210120120122210000101112000222210102010201 .......

Use grep with the lines of file2 as fixed-string patterns:

grep -F -f file2 file1 > file3

If you need to match only the first column and you can change the patterns file, add ^ to the front of each line and have grep treat the file as regular expressions (drop the -F option). For example:

file2:

^4046
^4050
^4047

Then:

grep -f file2 file1 > file3

If you can't change the patterns file, or it is generated outside of your control, then Serge's answer is the best one.
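
To see why anchoring matters, here is a minimal sketch with made-up data (the values in file1 and the temporary directory are assumptions for illustration, not taken from the question). With -F, a key also matches when it appears inside another column:

```shell
# Hypothetical sample data: the key 4046 also occurs inside the
# second column of the 4050 line.
dir=$(mktemp -d)
printf '%s\n' '4046 200344' '4050 140461' '4047 200122' > "$dir/file1"
printf '%s\n' '4046' > "$dir/file2"

# Fixed-string match: 4046 is found anywhere on the line,
# so the 4050 line is returned as well.
unanchored=$(grep -F -f "$dir/file2" "$dir/file1")

# Anchored regex match: only lines that start with the key.
sed 's/^/^/' "$dir/file2" > "$dir/patterns"
anchored=$(grep -f "$dir/patterns" "$dir/file1")

printf 'unanchored:\n%s\nanchored:\n%s\n' "$unanchored" "$anchored"
rm -r "$dir"
```

Note that ^4046 alone would still match a longer key such as 40461, so it only anchors the start of the key, not its end.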

Use the join command in conjunction with sort:

$ join <(sort 1.txt) <(sort 2.txt)
4046 200344
4047 200122
4050 200001
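
The join approach can also be sketched without process substitution, which is a bash feature. The input files below are assumptions modelled on the output shown above (the 4048 line is invented to show a non-matching row). join expects both inputs sorted on the join field under the same collation, so the sketch pins LC_ALL=C for sort and join alike:

```shell
# Assumed inputs: 1.txt holds key/value lines, 2.txt holds the wanted keys.
dir=$(mktemp -d)
printf '%s\n' '4046 200344' '4048 200999' '4050 200001' '4047 200122' > "$dir/1.txt"
printf '%s\n' '4046' '4050' '4047' > "$dir/2.txt"

# Sort both files with the same collation that join will use.
LC_ALL=C sort "$dir/1.txt" > "$dir/1.sorted"
LC_ALL=C sort "$dir/2.txt" > "$dir/2.sorted"

# Join on the first field (the default); unmatched keys are dropped.
joined=$(LC_ALL=C join "$dir/1.sorted" "$dir/2.sorted")

printf '%s\n' "$joined"
rm -r "$dir"
```

The result comes back in sorted key order rather than in the original order of either file.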

Another option:

sed 's/^/^/;s/$/[[:space:]]/' file2 | grep -f - file1 > file3
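
A hedged variant of the same idea, in case the key file contains trailing whitespace (the sample file2 in the question appears to, though that may be a copy-paste artifact). The data and temporary directory are hypothetical. It strips trailing whitespace first, then anchors each key at the start of the line and requires whitespace right after it:

```shell
# Hypothetical data: A10 must not match the key A1,
# and the key B2 carries trailing spaces on purpose.
dir=$(mktemp -d)
printf '%s\n' 'A1 111' 'A10 222' 'B2 333' > "$dir/file1"
printf '%s\n' 'A1' 'B2   ' > "$dir/file2"

# 1) drop trailing whitespace, 2) anchor at line start,
# 3) require a whitespace character after the key.
matched=$(sed 's/[[:space:]]*$//; s/^/^/; s/$/[[:space:]]/' "$dir/file2" \
  | grep -f - "$dir/file1")

printf '%s\n' "$matched"
rm -r "$dir"
```

Because the keys are used as regular expressions here, a key containing characters such as . or * would need escaping; the identifiers in the question are plain alphanumerics, so that case does not arise.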

One way using awk:

awk 'FNR==NR { array[$1]=$2; next } $1 in array { print $1, array[$1] }' file1.txt file2.txt

Results:

4046 200344
4050 200001
4047 200122

Edit: using real data:

awk 'FNR==NR { array[$1]=$0; next } $1 in array { print array[$1] }' file1.txt file2.txt

Results:

HOUSAM000001189870 1012212222011212100020102102112011002200111112012220110 ....... 
HOUSAM000001237057 0012011222010210120120122210000101112000222210102010201 .......
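
If file1 is large, a variation worth considering is to read the keys from file2 first and stream file1, so only the keys are held in memory. This is a sketch under assumed data (the K1..K4 lines and the temporary directory are made up), not the answer's original command:

```shell
# Hypothetical data: file1.txt has key/value lines, file2.txt has keys
# (one of them with trailing spaces, which awk's field splitting ignores).
dir=$(mktemp -d)
printf '%s\n' 'K1 aaa' 'K2 bbb' 'K3 ccc' 'K4 ddd' > "$dir/file1.txt"
printf '%s\n' 'K3' 'K1   ' > "$dir/file2.txt"

# First file (file2.txt): remember each key.
# Second file (file1.txt): print lines whose first field is a known key.
filtered=$(awk 'FNR==NR { keys[$1]; next } $1 in keys' "$dir/file2.txt" "$dir/file1.txt")

printf '%s\n' "$filtered"
rm -r "$dir"
```

Note the swapped argument order: the key file comes first. Output follows the line order of file1.txt, whereas the commands above print in the order of file2.txt.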
