How to extract rows from file1 whose first column matches an entry in file2 on Linux?
I have two files that look like this:
file 1:
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes
file 2:
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108
What I would like to do is extract the rows from file1 whose first column matches the first column of file2. Based on this example, the output should be:
file3:
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
Any suggestions?
UPDATE:
This command will create file3 with the desired output. Tested and works:
grep -f file2 file1 > file3
Output:
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
It uses grep's -f switch, which takes the name of a file containing one pattern per line. As per man grep:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns,
and therefore matches nothing. (-f is specified by POSIX.)
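One caveat with grep -f: each pattern is matched anywhere on the line, not only in the first column. In the sample data, HOUSAM55967108 is also the second column of the HOUSAF55969445 row, so that row would be selected as well. A minimal sketch of restricting the match to the first column with awk follows. This swaps in a different tool and is not part of the answer above; the scratch directory and the recreated sample files are assumptions added only to make the snippet runnable on its own.

```shell
# Work in a scratch directory so no real files are overwritten (assumption for the demo).
cd "$(mktemp -d)"

# Recreate the sample data from the question.
cat > file1 <<'EOF'
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes
EOF

cat > file2 <<'EOF'
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108
EOF

# While reading file2 (NR == FNR), remember each ID as an array key.
# While reading file1, print a row only if its first column is one of those IDs.
awk 'NR == FNR { ids[$1]; next } $1 in ids' file2 file1 > file3

cat file3
```

On the sample data this yields the three rows the question asks for, in the order they appear in file1, and neither file needs to be sorted first.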
The answer is to use the join command instead of grep, after sorting both files on the first column:
join file1 file2 > file3
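The sorting step is left implicit above, so here is one way it could look end to end. This is a sketch under assumptions: the scratch directory, the recreated sample files, and the file1.sorted / file2.sorted names are made up for the demo, and LC_ALL=C is used so that sort and join agree on the ordering.

```shell
# Work in a scratch directory so no real files are overwritten (assumption for the demo).
cd "$(mktemp -d)"

# Recreate the sample data from the question.
cat > file1 <<'EOF'
HO840F3000336240 HOUSAM129901651 HOUSAF132871174 F 20060607 Yes
HO840F3000336251 HOUSAM129800008 HOUSAF135774690 F 20060718 Yes
HO840F3000336254 HOUSAM129901651 HOUSAF135357862 F 20060724 Yes
HO840F3000487279 HOUSAM131520543 HOUSAF135761935 F 20061226 Yes
HOUSAM55967108 HOUSAM53557280 HOUSAF53557285 M 20091129 Yes
HOUSAF55969445 HOUSAM55967108 HOUSAF53579684 F 20120103 Yes
EOF

cat > file2 <<'EOF'
HO840F3000336251
HO840F3000487279
HOUSAF135761935
HOUSAM55967108
EOF

# join expects both inputs sorted on the join field with the same collation.
LC_ALL=C sort -k1,1 file1 > file1.sorted
LC_ALL=C sort -k1,1 file2 > file2.sorted

# By default join pairs lines on the first field of each file
# and drops lines that have no partner in the other file.
LC_ALL=C join file1.sorted file2.sorted > file3

cat file3
```

Because join compares only the first field, the HOUSAF55969445 row is not selected even though HOUSAM55967108 appears in its second column. The output comes out in sorted order rather than in file1's original order.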