How can I find which lines in one file do not start with any line from another file, using bash?

I have two text files, A and B:

A:

a start
b stop
c start
e start

B:

b
c

How can I find which lines in A do not start with any of the lines from B, using a shell (bash, etc.) command? In this case, I want to get this answer:

a start
e start

Can I do this with a single command line?

This should do:

sed '/^$/d;s/^/^/' B | grep -vf - A

The sed command takes all non-empty lines of file B (note the /^$/d command), prepends a caret ^ to each one (so that each pattern is anchored at the start of the line in grep's regexp), and writes the result to stdout. grep then reads its patterns from a file with the -f option (here the file is stdin, thanks to the - argument) and, because of the -v option, inverts the match, printing only the lines of A that match none of the patterns. Done.
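To make the two stages visible, here is a small sketch that recreates the sample files from the question in a scratch directory and prints what each stage produces. The scratch directory and the variable names `patterns` and `result` are only for illustration, and it assumes a grep that accepts `-f -` for stdin (GNU grep does).

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > A
printf '%s\n' 'b' 'c' > B

# Stage 1: what sed hands to grep, one anchored pattern per non-empty line of B.
patterns=$(sed '/^$/d;s/^/^/' B)
printf '%s\n' "$patterns"

# Stage 2: the full pipeline, printing the lines of A that match none of the patterns.
result=$(sed '/^$/d;s/^/^/' B | grep -vf - A)
printf '%s\n' "$result"
```

Note that the lines of B are used as regular expressions, so a line such as `a.b` in B would also exclude lines of A starting with `axb`.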

I think this should do it:

sed 's/^/\^/g' B > C.tmp
grep -vEf C.tmp A
rm C.tmp
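If you prefer the temporary-file approach, a slightly more defensive sketch is shown below. It assumes `mktemp` is available, uses a hypothetical variable name `patterns_file` instead of the fixed name C.tmp, and drops empty lines from B (an empty line would otherwise become the pattern `^`, which matches every line of A).

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > A
printf '%s\n' 'b' 'c' > B

# Use a unique temporary file and remove it when the script exits,
# even if one of the commands below fails.
patterns_file=$(mktemp)
trap 'rm -f "$patterns_file"' EXIT

# Write one anchored pattern per non-empty line of B, then filter A with them.
sed '/^$/d;s/^/^/' B > "$patterns_file"
result=$(grep -vf "$patterns_file" A)
printf '%s\n' "$result"
```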

You can try a combination of sed, tr, and grep.

Save the first letter of each line of B into FIRSTLETTERLIST. You can do this with some sed and tr work.

The idea is to treat that list as a blacklist and match it against the file you are interested in.

grep "^[^$FIRSTLETTERLIST]" A

This compares only the first character of each line, so it fits the example (where every line of B is a single letter) but not longer prefixes. It is untested, so I won't guarantee it will work, but it should point you in the right direction.
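The first-letter idea can be sketched roughly as follows. This is one possible way to build FIRSTLETTERLIST, not necessarily what the answer's author had in mind; it assumes B is not empty and that none of the first characters is special inside a bracket expression (such as `]`, `^` or `-`).

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > A
printf '%s\n' 'b' 'c' > B

# Collect the first character of every non-empty line of B into one string.
FIRSTLETTERLIST=$(sed '/^$/d;s/^\(.\).*/\1/' B | tr -d '\n')
printf '%s\n' "$FIRSTLETTERLIST"

# Keep only the lines of A whose first character is not in that list.
result=$(grep "^[^$FIRSTLETTERLIST]" A)
printf '%s\n' "$result"
```

Because only a single character is compared, this is narrower than the anchored-pattern approach: a line `bx` in B would exclude every line of A starting with `b`.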
