Un-matching words from file1 in file2

Question

I have two files, file1 and file2. file1 contains only words, for example:

ABC
YUI
GHJ
I8O

.................. .....................

file2 contains many paragraphs:

dfghjo ABC kll njjgg bla bla 
GHJ njhjckhv chasjvackvh ..
ihbjhi hbhibb jh jbiibi

................... ...................

I am using the command below to get the lines of file2 that contain a word from file1:

 grep -Ff file1 file2
(Prints the lines of file2 in which words from file1 are found.)

I also need the words from file1 that are not found in file2, and I have not been able to find a way to list these un-matching words.

Can anyone help me get the output below?

YUI
I8O

I am looking for a one-liner command (using grep, awk, or sed), as I am running it through pssh and can't use while or for loops.

Answer 1

You can print only the matched parts with -o.

$ grep -oFf file1 file2
ABC
GHJ

Use that output as a list of patterns for a search in file1. Process substitution <(cmd), available in bash, zsh, and ksh, simulates a file containing the output of cmd. With -v, grep prints the lines that did not match. If file1 contains two lines such that one is a substring of the other, you may want to add -x (match whole lines only) to prevent false positives.

$ grep -vxFf <(grep -oFf file1 file2) file1
YUI
I8O
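To see why -x matters, here is a small sketch with made-up data (ABC and ABCD are hypothetical words, not the ones from the question). It uses a temporary file instead of process substitution so it also runs in a plain sh; that is my own substitution, not part of the command above.

```shell
# Hypothetical sample data: ABC is a substring of ABCD, and only ABC occurs in file2.
tmp=$(mktemp -d)
printf '%s\n' ABC ABCD > "$tmp/file1"
printf '%s\n' 'dfghjo ABC kll' > "$tmp/file2"

# Words from file1 that occur somewhere in file2 (here: just ABC).
grep -oFf "$tmp/file1" "$tmp/file2" > "$tmp/matched"

# Without -x, the pattern ABC also matches the line ABCD, so ABCD is wrongly dropped.
without_x=$(grep -vFf "$tmp/matched" "$tmp/file1") || true
# With -x, only the exact line ABC is removed and ABCD is reported as un-matched.
with_x=$(grep -vxFf "$tmp/matched" "$tmp/file1") || true

printf 'without -x: [%s]\nwith -x: [%s]\n' "$without_x" "$with_x"
# prints:
# without -x: []
# with -x: [ABCD]
rm -rf "$tmp"
```

ABCD never appears in file2, so it should be listed; only the -x variant does that.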

Answer 2

Using Perl, with both matched and non-matched words in the same one-liner:

$ cat sinw.txt
ABC
YUI
GHJ
I8O

$ cat sin_in.txt
dfghjo ABC kll njjgg bla bla
GHJ njhjckhv chasjvackvh ..
ihbjhi hbhibb jh jbiibi

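$ perl -lne '
    # Load the words; group the alternatives so that \b applies to every word.
    BEGIN { %x = map { chomp; $_ => 1 } qx(cat sinw.txt); $w = "\\b(?:" . join("|", map { quotemeta } keys %x) . ")\\b" }
    # Print every match on the line (not only the first) and drop it from the list.
    while (/$w/g) { print $&; delete $x{$&} }
    END { print "\nnon-matched\n" . join("\n", sort keys %x) }
' sin_in.txt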

ABC
GHJ

non-matched
I8O
YUI

$

Getting only the non-matched words:

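$ perl -lne '
    BEGIN {
        %x = map { chomp; $_ => 1 } qx(cat sinw.txt);
        # Group the alternatives so that \b applies to every word.
        $w = "\\b(?:" . join("|", map { quotemeta } keys %x) . ")\\b"
    }
    # Remove every word found on this line from the list.
    delete $x{$&} while /$w/g;
    END { print "\nnon-matched\n" . join("\n", sort keys %x) }
' sin_in.txt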

non-matched
I8O
YUI

$

Note that, in Perl versions prior to 5.20, even a single use of the $& variable was very expensive for the whole program.

Answer 3

Assuming the "words" in file1 are spread over more than one line:

  while read -r line
  do
    for word in $line
    do
       # -F treats the word as a fixed string; quoting keeps it a single argument
       if ! grep -qF -- "$word" file2
         then echo "$word not found"
       fi
    done
  done < file1

Answer 4

For un-matching words, here's one GNU awk solution:

awk 'NR==FNR{a[$0];next} !($1 in a)' RS='[ \n]' file2 file1
YUI
I8O

Or !($0 in a), it's the same, because each record is a single word. Since I set RS='[ \n]', every space acts as a record separator too.

Note that I read file2 first, and then file1.

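If file2 could be empty, NR==FNR is also true while file1 is being read, so use a different check for the first file, such as ARGIND (GNU awk only), FILENAME=="file2", or a comparison of FILENAME with the matching ARGV entry. Note that an RS=... assignment written before the file names is itself an ARGV entry, so in the command above file2 is ARGV[2] (and ARGIND==2); pass the separator as -v RS='[ \n]' if you want file2 to be ARGV[1].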
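A minimal sketch of the empty-file2 case, with throwaway files in a temporary directory. It assumes an awk that treats a multi-character RS as a regex (gawk and mawk do), and it passes RS with -v so that file2 really is ARGV[1].

```shell
tmp=$(mktemp -d)
printf '%s\n' ABC YUI GHJ I8O > "$tmp/file1"
: > "$tmp/file2"    # file2 exists but is empty

# NR==FNR stays true while file1 is read, so every word is stored and nothing is printed.
by_nr=$(awk -v RS='[ \n]' 'NR==FNR{a[$0];next} !($0 in a)' "$tmp/file2" "$tmp/file1")

# Comparing FILENAME with ARGV[1] still tells the two files apart.
by_name=$(awk -v RS='[ \n]' 'FILENAME==ARGV[1]{a[$0];next} !($0 in a)' "$tmp/file2" "$tmp/file1")

printf 'NR==FNR: [%s]\n' "$by_nr"
printf 'FILENAME==ARGV[1]:\n%s\n' "$by_name"
rm -rf "$tmp"
```

With an empty file2 none of the words can match, so the second command listing all four words is the expected result.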

The same mechanism works for only the matched words too:

awk 'NR==FNR{a[$0];next} $0 in a' RS='[ \n]' file2 file1
ABC
GHJ
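If both lists are wanted from a single pass, the two commands can probably be folded into one. This is only a sketch on the sample data from the question: the "matched"/"unmatched" labels and the seen array name are my own choices, and it again assumes an awk with regex RS (gawk, mawk).

```shell
tmp=$(mktemp -d)
printf '%s\n' ABC YUI GHJ I8O > "$tmp/file1"
printf '%s\n' 'dfghjo ABC kll njjgg bla bla' \
              'GHJ njhjckhv chasjvackvh ..' \
              'ihbjhi hbhibb jh jbiibi' > "$tmp/file2"

report=$(awk -v RS='[ \n]' '
    FILENAME == ARGV[1] { seen[$0]; next }    # remember every word of file2
    $0 != "" { print $0, (($0 in seen) ? "matched" : "unmatched") }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$report"
# prints:
# ABC matched
# YUI unmatched
# GHJ matched
# I8O unmatched
rm -rf "$tmp"
```

The output can then be filtered further, for example with grep ' unmatched$'.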

4 answers

Answer 1
1 2019-02-19 09:45:32

Answer 2
1 2019-02-20 12:43:46

Answer 3
0 2019-02-19 09:09:14

Answer 4
0 2019-02-19 09:24:51
