
How to extract rows from file1 whose first column matches an entry in file2 (Linux)?

I have two files that look like this:

file 1:
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes

file 2:
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108

What I would like to do is extract the rows from file 1 whose first column matches the first column of file 2. Based on this example, the output should be:

file3:

HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes

Any suggestions?

UPDATE:

This command creates file3 with the desired output. Tested and works:

grep -f file2 file1 > file3

Output:

HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes

It uses grep's -f switch, which takes the name of a file containing one pattern per line. Note that each pattern is matched against the whole line, not only against the first column. As per man grep:

    -f FILE, --file=FILE
           Obtain patterns from FILE, one per line.  The empty file contains zero patterns,
           and therefore matches nothing.  (-f is specified by POSIX.)
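Because grep patterns are not tied to a column, on the sample files shown in the question the row starting with HOUSAF55969445 would also be selected, since its second column is HOUSAM55967108. A minimal sketch of one alternative that compares only the first column, using awk, follows. It is not part of the answer above. The scratch directory and the `workdir` variable are assumptions for illustration, and the sample data is copied from the question.

```shell
# Work in a scratch directory so existing files are not overwritten.
# (workdir is a hypothetical name used only for this sketch.)
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > file1 <<'EOF'
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes
EOF

cat > file2 <<'EOF'
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108
EOF

# While reading file2 (NR == FNR), remember each ID found in column 1.
# While reading file1, print only rows whose first column is a remembered ID.
awk 'NR == FNR { ids[$1]; next } $1 in ids' file2 file1 > file3

cat file3
```

On the sample data this prints the three rows listed under file3 in the question, in the order they appear in file1. The sketch assumes file2 is not empty; with an empty file2 the `NR == FNR` test would also be true for file1 and nothing would be printed.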

Use the join command instead of grep.

After sorting both files on the first column:

join file1 file2 > file3
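The sort step can be spelled out as follows. This is a sketch under stated assumptions rather than the answerer's exact commands: the `workdir` scratch directory and the `file1.sorted` / `file2.sorted` names are hypothetical, the sample data is copied from the question, and `LC_ALL=C` is set so that sort and join agree on the collation order.

```shell
# Work in a scratch directory so existing files are not overwritten.
# (workdir and the *.sorted names are hypothetical, for illustration only.)
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > file1 <<'EOF'
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes
EOF

cat > file2 <<'EOF'
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108
EOF

# join requires both inputs to be sorted on the join field (column 1 by default).
LC_ALL=C sort -k1,1 file1 > file1.sorted
LC_ALL=C sort -k1,1 file2 > file2.sorted

# Keep only the rows whose first field appears in both sorted files.
LC_ALL=C join file1.sorted file2.sorted > file3

cat file3
```

Unlike grep, join compares only the join field, so an ID that appears in another column of file1 does not select that row. The output follows the sorted key order rather than the original order of file1; for this sample the two happen to coincide.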
