
Linux command to grab lines similar between files

I have one file that has one word per line.

I have a second file that has many words per line.

I would like to go through each word in the first file and, for every line of the second file in which that word is found, copy that line from the second file into a new third file.

Is there a simple way to do this with a Linux command?

Edit: Thanks for the input. But, I should specify better:

The first file is just a list of numbers (one number per line).

463463
43454
33634

The second file is very messy, and I am only looking for that number string to appear anywhere in a line (not necessarily as an individual word). So, for instance,

ewjleji jejeti ciwlt 463463.52%

would return a hit. I think what was suggested to me does not work in this case (please forgive my having to edit for not being detailed enough).

If n is the number of lines in your first file and m is the number of lines in your second file, then you can solve this problem in O(nm) time in the following way:

while IFS= read -r word; do
    # -F treats the word as a literal string rather than a regular expression
    grep -F -- "$word" secondfile >>thirdfile
done <firstfile

If you need to solve it more efficiently than that, grep's -f option, which reads every pattern from a file, avoids rescanning the second file once per word.

As for your edit, this method does work the way you describe.
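
For a single pass over the second file, grep can take the whole first file as its pattern list. The following is only a sketch: the sample data and the temporary directory are made up for illustration, and firstfile, secondfile and thirdfile stand in for the real files.

```shell
#!/bin/bash
# Hypothetical sample data, written to a temporary directory
tmpdir=$(mktemp -d)
printf '%s\n' 463463 43454 33634 > "$tmpdir/firstfile"
printf '%s\n' 'ewjleji jejeti ciwlt 463463.52%' \
              'nothing to see here' \
              'id=33634;ok' > "$tmpdir/secondfile"

# -F: treat each pattern as a fixed string, not a regular expression
# -f: read the patterns, one per line, from firstfile
grep -F -f "$tmpdir/firstfile" "$tmpdir/secondfile" > "$tmpdir/thirdfile"

# Show the matching lines
cat "$tmpdir/thirdfile"
```

Unlike the loop, this writes each matching line once, in the order of the second file, even if several numbers occur in it. An empty line in the first file would act as an empty pattern and match every line, so it may be worth stripping blank lines from the list first.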

Here is a short script that will do it. It takes three command-line arguments: (1) the file with one word per line, (2) the file with many lines to search for each word in file 1, and (3) the output file:

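#!/bin/bash

## test input and show usage on error
test -n "$1" && test -n "$2" && test -n "$3" || {
    printf "Error: insufficient input, usage: %s file1 file2 file3\n" "${0##*/}" >&2
    exit 1
}

## read file1 line by line; "|| test -n" keeps a last line that lacks a trailing newline
while IFS= read -r line || test -n "$line" ; do
    ## append every line of file2 that contains the word to file3
    grep -- "$line" "$2" 1>>"$3" 2>/dev/null
done <"$1"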

example:

$ cat words.txt
me
you
them

$ cat lines.txt
This line is for me
another line for me
maybe another for me
one for you
another for you
some for them
another for them
here is one that doesn't match any

$ bash ../lines.sh words.txt lines.txt outfile.txt

$ cat outfile.txt
This line is for me
another line for me
maybe another for me
some for them
one for you
another for you
some for them
another for them

(Yes, I know that "me" also matches "some" in the example file, but that's not really the point.)
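
If whole-word matching is wanted for a word list like this one, grep's -w option is one way to avoid hits such as "me" inside "some". A small sketch with made-up sample files follows; it would not suit the question's case where a number may be embedded inside a longer token.

```shell
#!/bin/bash
# Hypothetical sample data, written to a temporary directory
tmpdir=$(mktemp -d)
printf '%s\n' me > "$tmpdir/words.txt"
printf '%s\n' 'This line is for me' 'some for them' > "$tmpdir/lines.txt"

# Without -w, "me" also matches inside "some"
grep -F -f "$tmpdir/words.txt" "$tmpdir/lines.txt" > "$tmpdir/substring.txt"

# With -w, "me" must appear as a whole word
grep -F -w -f "$tmpdir/words.txt" "$tmpdir/lines.txt" > "$tmpdir/wholeword.txt"

echo "substring matches:";  cat "$tmpdir/substring.txt"
echo "whole-word matches:"; cat "$tmpdir/wholeword.txt"
```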
