
How to match substring from one file to another file using grep?

I have two files: File1, with a bunch of email addresses, and File2, with a list of domains.

I want to find all the email addresses that match the domains (and also the non-matching ones).

Could someone please let me know how to do this using 'grep' from the terminal?

File1.csv
abc@gmail.com
abc@fmail.com
abc@fb.com
abc@hotmail.com
abc@outlook.com
abc@live.com

File2
hotmail.com
live.com
fb.com

The output should be (and non-matching as well)
abc@fb.com
abc@hotmail.com
abc@live.com

Please note that the email file is large: it contains 2M emails to compare against 6k domains.

You can use -f to read the patterns from a file:

grep -f File2 File1.csv
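Since the question also asks for the non-matching addresses, here is a minimal sketch covering both directions. It recreates the sample data from the question in a scratch directory; the names workdir, matched.txt and unmatched.txt are assumptions made for illustration. The -F flag, not used in the command above, treats each domain as a literal string so that the dot is not read as a regex wildcard, and -v inverts the selection.

```shell
# Recreate the sample data from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' abc@gmail.com abc@fmail.com abc@fb.com \
    abc@hotmail.com abc@outlook.com abc@live.com > "$workdir/File1.csv"
printf '%s\n' hotmail.com live.com fb.com > "$workdir/File2"

# -f reads one pattern per line from File2; -F treats each as a fixed string.
grep -F -f "$workdir/File2" "$workdir/File1.csv" > "$workdir/matched.txt"

# -v inverts the match: addresses that contain none of the listed domains.
grep -v -F -f "$workdir/File2" "$workdir/File1.csv" > "$workdir/unmatched.txt"
```

Fixed-string matching is also likely to help with a pattern list of several thousand domains, although the actual run time depends on the grep implementation.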

In your comment, you are trying to match a fixed pattern of the form @[anydomainNameInFile2.txt].

In this case, you might need to add @ at the beginning of each line in File2 so that you can use it as a fixed pattern.

You can do it as follows:

  1. Add @ at the beginning of each line in file2.txt using the sed command.

     $ sed 's/^/@/' file2.txt > new-file.txt

Don't worry, this won't mess with your original files (you said the main file has about 2M entries); the output is saved to another file named new-file.txt.

  2. Run grep with the -f option, using new-file.txt as the pattern file:

     $ grep -f new-file.txt File1.csv
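The two steps above can be sketched end to end as follows. This is an illustration under assumptions, not the only way to do it: the sample address abc@notfb.com is invented to show why the @ prefix matters, the names workdir, matched.txt and unmatched.txt are hypothetical, and -F and -v are additions to the commands above (literal matching and inverted matching, respectively).

```shell
# Scratch directory with a small sample; abc@notfb.com is an invented
# address that a bare "fb.com" pattern would match by accident.
workdir=$(mktemp -d)
printf '%s\n' abc@gmail.com abc@notfb.com abc@fb.com abc@hotmail.com \
    > "$workdir/File1.csv"
printf '%s\n' hotmail.com fb.com > "$workdir/file2.txt"

# Step 1: prefix every domain with "@"; file2.txt itself is left untouched.
sed 's/^/@/' "$workdir/file2.txt" > "$workdir/new-file.txt"

# Step 2: match the "@domain" strings literally (-F) against the emails.
grep -F -f "$workdir/new-file.txt" "$workdir/File1.csv" > "$workdir/matched.txt"

# Same pattern file with -v to collect the non-matching addresses.
grep -v -F -f "$workdir/new-file.txt" "$workdir/File1.csv" > "$workdir/unmatched.txt"
```

Note that the pattern is still not anchored at the end of the line, so an address such as abc@fb.com.au would also be reported as a match for fb.com.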
