
Fast way to delete pattern lines from a file in shell

I have a file1 with approximately 60,000 lines and a file2 with approximately 20,000 lines. I need to delete the lines present in file2 from file1. file2 also contains patterns such as .* so that similar lines are deleted from file1 as well.

file1:

ABC DEG
bhdh jdjjd
cdhhd jdjd
ABC hjj

file2:

ABC.*
cdhhd jdjd

Output should be:

bhdh jdjjd

Right now, I am using the code below.

while read -r line
do
  sed -i "/${line}/d" "$file1"
done < "$file2"

With this code, it takes around 30 minutes to get the output. I really need a faster way to delete those lines from file1.

grep does exactly this:

grep -vf file2 file1

-f file2 reads the patterns from file2, one per line, and -v inverts the match, so grep prints only the lines of file1 that match none of those patterns.
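As a quick check, the command can be run against the sample data from the question. This is only a sketch: the scratch directory and the `result` variable are illustrative choices, not part of the original answer.

```shell
#!/usr/bin/env bash
# Recreate the question's sample files in a scratch directory (assumption:
# a throwaway location is acceptable for the demonstration).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'ABC DEG' 'bhdh jdjjd' 'cdhhd jdjd' 'ABC hjj' > file1
printf '%s\n' 'ABC.*' 'cdhhd jdjd' > file2

# -f file2: read one pattern per line from file2
# -v:       print only the lines of file1 that match none of them
result=$(grep -vf file2 file1)
echo "$result"   # prints: bhdh jdjjd
```

Two things to watch for: the patterns are unanchored regular expressions, so they match anywhere in a line (as they did in the sed loop), and an empty line in file2 acts as an empty pattern that matches every line, which would remove everything.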


Note: Your loop is very slow because it reads the patterns file line by line in a bash loop and runs thousands of sed commands, one per pattern, each of which rewrites file1 in full. Processing text with a shell loop like this is generally considered bad practice.


Note: To replace file1 with the output of the above command:

grep -vf file2 file1 > file1.tmp && mv file1.tmp file1
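One edge case may be worth handling: grep exits with status 1 when it prints no lines, so if every line of file1 matches a pattern, the `&&` form skips the mv and leaves file1 unchanged. The sketch below treats exit status 0 and 1 both as success. The helper name `filter_in_place` is hypothetical, introduced here only for illustration.

```shell
#!/usr/bin/env bash
# filter_in_place PATTERNS TARGET
# Hypothetical helper (not from the original answer): removes from TARGET
# every line that matches a pattern listed in PATTERNS.
filter_in_place() {
  local patterns="$1" target="$2" status
  grep -vf "$patterns" "$target" > "$target.tmp"
  status=$?
  # grep exits 0 if it printed lines, 1 if it printed none, >1 on error.
  if [ "$status" -le 1 ]; then
    mv "$target.tmp" "$target"
  else
    rm -f "$target.tmp"
    return "$status"
  fi
}

# Demonstration on the question's sample data, in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'ABC DEG' 'bhdh jdjjd' 'cdhhd jdjd' 'ABC hjj' > file1
printf '%s\n' 'ABC.*' 'cdhhd jdjd' > file2

filter_in_place file2 file1
cat file1
```

Writing the temporary file next to the target keeps the mv on the same filesystem, which matches the approach in the answer's one-liner.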
