I have file1 with approximately 60,000 lines and file2 with approximately 20,000 lines. I need to delete from file1 every line that is present in file2. file2 also contains patterns ending in .* so that similar lines are deleted from file1 as well.
file1:
ABC DEG
bhdh jdjjd
cdhhd jdjd
ABC hjj
file2:
ABC.*
cdhhd jdjd
Output should be:
bhdh jdjjd
Right now, I am using the below code.
while read -r line
do
sed -i "/${line}/d" $file1
done < "$file2"
With this code, it takes around 30 minutes to get the output. I need a faster way to delete those lines from file1.
grep does exactly this:
grep -vf file2 file1
-v inverts the match, so lines of file1 that match any pattern are excluded; -f file2 reads the patterns from file2, one per line.
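As a quick check, the sample data from the question can be run through the same command. This is only a sketch: the scratch directory and the workdir and result variable names are assumptions made for the demonstration, not part of the original files.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'ABC DEG' 'bhdh jdjjd' 'cdhhd jdjd' 'ABC hjj' > "$workdir/file1"
printf '%s\n' 'ABC.*' 'cdhhd jdjd' > "$workdir/file2"

# -f reads one pattern per line from file2; -v keeps only non-matching lines.
result=$(grep -vf "$workdir/file2" "$workdir/file1")
printf '%s\n' "$result"   # prints: bhdh jdjjd

# Clean up the scratch directory.
rm -rf "$workdir"
```

Note that the patterns are unanchored regular expressions, so a pattern such as cdhhd jdjd also removes any longer line that contains it. If only whole-line matches should be deleted, adding -x (grep -vxf file2 file1) is one option; ABC.* still matches complete lines that start with ABC. An empty line in file2 is an empty pattern that matches every line, so it is worth checking the patterns file for blank lines first.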
Note: Your loop is very slow because it reads the patterns file line by line in a bash loop and runs thousands of sed commands, one for every pattern. Each sed -i call also rereads and rewrites all of file1. See also here for more on why this is bad practice.
Note: To replace file1 with the output of the above command:
grep -vf file2 file1 > file1.tmp && mv file1.tmp file1
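One caveat with chaining on &&: grep exits with status 1 when it selects no lines, so if every line of file1 is deleted, the mv is skipped and file1 is left unchanged. A possible workaround is sketched below; filter_in_place is a hypothetical helper name, and the temporary-file naming is an assumption.

```shell
# Hypothetical helper: remove from file $1 every line matching a pattern in file $2.
filter_in_place() {
    target=$1
    patterns=$2
    # Create the temporary file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    status=0
    grep -vf "$patterns" "$target" > "$tmp" || status=$?
    # grep exits 0 if lines were selected, 1 if none were, and >1 on a real error.
    if [ "$status" -le 1 ]; then
        mv "$tmp" "$target"
    else
        rm -f "$tmp"
        return "$status"
    fi
}
```

It would be called as filter_in_place file1 file2. Because mktemp creates the temporary file with restrictive permissions, file1 may end up with different permissions than before; restore them afterwards if that matters.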