
Remove specific lines from a file using bash

I have two files, 1.txt and 2.txt. 1.txt has a number on each line indicating an itemID. For example:

43
345
65

The second file is a CSV with the pattern userID,itemID,time.

I want to remove from the second file all lines whose itemID appears in the first file. For this purpose I do the following:

#!/bin/bash

while IFS= read -r var 
do
 paste -sd '|' | xargs -I{} grep -v -E {} 2.txt
done < "1.txt"

I read the first file and build a regular expression, but I don't know which arguments to give egrep so that it matches only the second field (itemID).
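As an aside, the loop above has two problems: the body never uses `$var`, and `paste` reads from the same stdin that feeds the `while` loop, so it swallows the rest of 1.txt on the first iteration. A working loop-based sketch (using demo data assumed from the question's samples; the single-pass answers below are faster) would anchor the id as the second field:

```shell
# Demo files (contents are assumptions based on the question's samples):
printf '43\n345\n65\n' > 1.txt
printf 'u1,43,100\nu2,12,101\nu3,345,102\nu4,7,103\n' > 2.txt

# Re-filter the CSV once per id, anchoring the id as the second field
# so that e.g. "43" cannot match inside userID or time.
cp 2.txt filtered.txt
while IFS= read -r id; do
    grep -v -E "^[^,]*,${id}," filtered.txt > filtered.tmp
    mv filtered.tmp filtered.txt
done < 1.txt

cat filtered.txt
# u2,12,101
# u4,7,103
```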

You can use sed to add commas to the beginning and end of each id in 1.txt, then just use grep to filter out the resulting strings:

sed 's/^\|$/,/g' 1.txt | grep -vFf- 2.txt
  • -F interprets the patterns as fixed strings, i.e. not as regular expressions
  • -f tells grep to read the patterns from the given file, in this case - , i.e. standard input
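A quick run with demo data (file contents are assumptions, not from the question) shows the effect; note that `\|` alternation in the sed expression is a GNU sed extension:

```shell
# Demo files (assumed sample data):
printf '43\n345\n65\n' > 1.txt
printf 'u1,43,100\nu2,12,101\nu3,345,102\nu4,7,103\n' > 2.txt

# Each id becomes ",43," etc., so it only matches a full middle field.
sed 's/^\|$/,/g' 1.txt | grep -vFf- 2.txt
# u2,12,101
# u4,7,103
```

Because the ids are wrapped in commas on both sides, this relies on itemID being a middle field; a leading userID or trailing time equal to an id will not be touched.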

With awk

awk -F "," 'FNR==NR{a[$0]=1;next} a[$2]!=1{print}' 1.txt 2.txt
  • a[$0] = 1 records each item id from 1.txt in the array a
  • a[$2] != 1 prints the lines from 2.txt whose item id is not in array a
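With the same assumed demo data, an equivalent and slightly more idiomatic form stores only the keys and tests membership with `in`, avoiding the numeric comparison against 1:

```shell
# Demo files (assumed sample data):
printf '43\n345\n65\n' > 1.txt
printf 'u1,43,100\nu2,12,101\nu3,345,102\nu4,7,103\n' > 2.txt

# Referencing a[$0] creates the key; !($2 in a) keeps lines whose
# second field was never seen in 1.txt.
awk -F, 'FNR==NR{a[$0];next} !($2 in a)' 1.txt 2.txt
# u2,12,101
# u4,7,103
```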
