Comparing each line in a text file to each line in another text file

I've been having some difficulty with shell scripting lately. I ran some tests and still can't get this to work. I'm trying to compare each line in one text file against another text file. The idea is to find the lines that are not in file2.

Could someone see what is wrong with my script?

Thanks!

#!/bin/bash
  FILE1='/filePath/file.txt'
  FILE2='/filePath/file2.txt'
  for line in $FILE1 
  do
    for line2 in $FILE2
    do
        if  $line != $line2
            then
            echo -e /> diffsScr.txt
        fi
    done 
  done

awk 'FNR==NR{f[$0]+=1; next} !($0 in f)' input1 input2

This reads through the file input1 and builds an array keyed by its lines. It then goes through input2 and prints each line that did not appear in input1. To list the lines of your first file that are missing from file2, pass file2.txt as input1 and file.txt as input2. If you want to add line numbers:

awk 'FNR==NR{f[$0]+=1; next} !($0 in f) { print FNR, $0}' input1 input2
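
To make the behavior concrete, here is a small sketch that runs the line-number variant on two throwaway files. The temporary directory, the file contents, and the variable name awk_out are made up for illustration; only the awk program itself comes from the answer.

```shell
#!/bin/bash
# Hypothetical sample data; names and contents are illustrative only.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/input1"
printf '%s\n' banana date apple elderberry > "$tmpdir/input2"

# First pass (FNR==NR): remember every line of input1 as an array key.
# Second pass: print line number and text of each input2 line not seen in input1.
awk_out=$(awk 'FNR==NR{f[$0]+=1; next} !($0 in f) { print FNR, $0}' \
    "$tmpdir/input1" "$tmpdir/input2")
echo "$awk_out"

rm -rf "$tmpdir"
```

One caveat worth knowing: if input1 is empty, FNR==NR stays true while input2 is read, so nothing is printed.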

One big advantage of this approach is that it scales well. Your approach is O(n*m), where n and m are the number of lines in the files, but pre-reading into an array like this gives you a solution that is O(n+m). In other words, you only read through each file once.
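
For contrast, the nested-loop idea from the question can be made to work, although it stays O(n*m). In the original script, `for line in $FILE1` iterates over the file name rather than the file's contents, `if $line != $line2` tries to run the line as a command instead of testing it, and `echo -e /> diffsScr.txt` overwrites the output file with a slash. The sketch below is one possible repair, not the only one; the function name lines_missing_from and the sample files are hypothetical.

```shell
#!/bin/bash
# Sketch: print each line of file $1 that has no exact match in file $2.
# The inner loop rereads the second file for every line of the first: O(n*m).
lines_missing_from() {
    local line line2 found
    while IFS= read -r line; do
        found=0
        while IFS= read -r line2; do
            if [ "$line" = "$line2" ]; then
                found=1
                break
            fi
        done < "$2"
        # Report the line only after the whole second file was searched.
        if [ "$found" -eq 0 ]; then
            printf '%s\n' "$line"
        fi
    done < "$1"
}

# Hypothetical sample data for a quick check.
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/file.txt"
printf '%s\n' gamma alpha > "$tmpdir/file2.txt"
loop_out=$(lines_missing_from "$tmpdir/file.txt" "$tmpdir/file2.txt")
echo "$loop_out"
rm -rf "$tmpdir"
```

To write the result to a file as the question intended, the call could be redirected once, for example `lines_missing_from file.txt file2.txt > diffsScr.txt`.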

You can achieve this result with the following command:

fgrep -v -x -f '/filePath/file2.txt' '/filePath/file1.txt'
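
A small sketch of what the flags do, using `grep -F`, which is the equivalent modern spelling of fgrep (recent GNU grep releases warn that fgrep is obsolescent). The sample files and the variable name grep_out are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical sample data; file2.txt holds the lines to exclude.
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma abc > "$tmpdir/file1.txt"
printf '%s\n' gamma alpha 'a.c' > "$tmpdir/file2.txt"

# -f FILE  read the patterns from FILE, one per line
# -F       treat patterns as fixed strings, so "a.c" does not match "abc"
# -x       a pattern must match the whole line
# -v       invert: keep only lines that match none of the patterns
grep_out=$(grep -F -v -x -f "$tmpdir/file2.txt" "$tmpdir/file1.txt")
echo "$grep_out"

rm -rf "$tmpdir"
```

Without -F the pattern `a.c` would be read as a regular expression and would suppress the line `abc` as well.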

If your files are sorted, you can use comm to compare them.

comm -23 file1 file2  

Description of comm:

   Compare sorted files FILE1 and FILE2 line by line.

   With  no  options,  produce  three-column  output.  Column one contains
   lines unique to FILE1, column two contains lines unique to  FILE2,  and
   column three contains lines common to both files.

   -1     suppress column 1 (lines unique to FILE1)

   -2     suppress column 2 (lines unique to FILE2)

   -3     suppress column 3 (lines that appear in both files)
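
If the files are not already sorted, they can be sorted first. The following is a minimal sketch under that assumption; the sample files, the .sorted names, and the variable comm_out are made up for illustration.

```shell
#!/bin/bash
# Hypothetical, unsorted sample data.
tmpdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$tmpdir/file1"
printf '%s\n' fig kiwi apple > "$tmpdir/file2"

# comm expects both inputs sorted in the same collation order,
# so fix the locale and sort each file before comparing.
export LC_ALL=C
sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -23 suppresses columns 2 and 3, leaving only lines unique to the first file.
comm_out=$(comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
echo "$comm_out"

rm -rf "$tmpdir"
```

In bash the intermediate files can be avoided with process substitution: `comm -23 <(sort file1) <(sort file2)`.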
