How can I find which lines in one file do not start with any line from another file, using bash?

Question

I have two text files, A and B:

A:

a start
b stop
c start
e start

B:

b
c

How can I find which lines in A do not start with any line from B, using a shell (bash, ...) command? In this case, I want to get this output:

a start
e start

Can I do this with a single command line?

Answer 1

This should do it:

sed '/^$/d;s/^/^/' B | grep -vf - A

The sed command drops every empty line of file B (note the /^$/d command), prepends a caret ^ to each remaining line (so that grep anchors the pattern at the start of the line), and writes the result to stdout. grep then reads its patterns from a file with the -f option (here the file is stdin, thanks to the - argument) and, because of the -v option, inverts the match: it prints only the lines of A that match none of the patterns. Note that the lines of B are interpreted as regular expressions, so characters such as . or * in B keep their special meaning. Done.
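To make the pipeline above easy to try, here is a small sketch that recreates the sample files from the question in a scratch directory and runs the same command on them. The names workdir and result are only illustrative and not part of the original answer; mktemp -d is assumed to be available.

```shell
# Recreate the sample files A and B from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > "$workdir/A"
printf '%s\n' 'b' 'c' > "$workdir/B"

# Anchor every non-empty line of B with ^, then print the lines of A
# that match none of the resulting patterns.
result=$(sed '/^$/d;s/^/^/' "$workdir/B" | grep -vf - "$workdir/A")
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -r "$workdir"
```

With the sample input, this should print the two lines a start and e start.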

Answer 2

I think this should do it:

sed '/^$/d;s/^/^/' B > C.tmp
grep -vEf C.tmp A
rm C.tmp
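A fixed file name such as C.tmp can collide with an existing file in the current directory. The following is a sketch of the same temporary-file approach with a unique name, assuming mktemp is available; workdir, patterns and result are hypothetical names chosen for the example, and the sample files are the ones from the question.

```shell
# Sample input from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > "$workdir/A"
printf '%s\n' 'b' 'c' > "$workdir/B"

# Build the anchored pattern file under a unique name instead of a fixed C.tmp.
# Empty lines of B are dropped so that no bare ^ pattern is produced.
patterns=$(mktemp "$workdir/patterns.XXXXXX")
sed '/^$/d;s/^/^/' "$workdir/B" > "$patterns"

# Print the lines of A that match none of the anchored patterns.
result=$(grep -vEf "$patterns" "$workdir/A")
printf '%s\n' "$result"

# Remove the pattern file together with the scratch directory.
rm -r "$workdir"
```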

Answer 3

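You can try using a combination of cut, tr, and grep.

Save the first letter of each line of B into FIRSTLETTERLIST. You can do this with cut and tr.

The idea is to build a blacklist of characters and then match it against the file you are interested in. This only covers the case where every line of B is a single character, as in the example, and where none of those characters is special inside a bracket expression (such as ], ^ or -).

FIRSTLETTERLIST=$(cut -c1 B | tr -d '\n')
grep "^[^$FIRSTLETTERLIST]" A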

This is untested, so I won't guarantee it will work, but it should point you in the right direction.
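The blacklist idea can also be applied to whole lines of B rather than single characters. Below is a minimal sketch using awk, which is not what the answer proposed: it treats each non-empty line of B as a literal prefix, so regex metacharacters in B have no special meaning. The names workdir, result and prefixes are illustrative only.

```shell
# Sample input from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > "$workdir/A"
printf '%s\n' 'b' 'c' > "$workdir/B"

result=$(awk '
    # First file (B): remember every non-empty line as a literal prefix.
    FILENAME == ARGV[1] { if ($0 != "") prefixes[$0] = 1; next }
    # Second file (A): skip the line if it begins with any stored prefix.
    {
        for (p in prefixes)
            if (index($0, p) == 1) next
        print
    }
' "$workdir/B" "$workdir/A")
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -r "$workdir"
```

The loop checks every prefix for every line of A, which is fine for small files but may be slow when B is large.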

solution1
3 ACCEPTED 2012-12-02 17:58:00

solution2
1 2012-12-02 17:52:44

solution3
0 2012-12-02 17:57:09
