Comparing two files in AWK

Question

I have two .txt files and I want to check whether the contents of one file are present in the other. My Book1.txt contents are:

PATX248
PATX216
PATX203
PATX219B
PATX212
PATX248
PATX211
PATX190
PATX222
PATX241
B8025
B1003
B8063
B8032
C0999
C1035
B1011

My InventorySheet2finaloutput.txt is:

B8061P3 366-L4/26/2017 1
PATX-148 P3 4
 1003P4 M#1N-L1/19/2017
B1011P5 330-L2/23/2017 1
B8032P3 336-L3/10/2017 1
B1011P5 329-L2/14/2017 1
PATX-60P5 279-L2/8/2017 1
PATX-70 P3 5
B1573P6 1R-R8/10/2017 1
B8025 P4 5
B8025 P5 1
 1061P3 372-R4/26/2017
 2078 P4M#1RR-R8/25/2017
C0999 P5 4
B8078 P4M#1N-R8/25/2017 2
C-1008 P4 1
PATX-55 P4 4
B1003P5 325-R3/3/2017 1
PATX-45P4 266-L2/14/2017 1
B8032P4 384-R4/26/2017 1
C-1035 P3 1
B8032P3 340-R3/17/2017 1

Output:

B1003P5 325-R3/3/2017 1
B8032P3 336-L3/10/2017 1
B8032P4 384-R4/26/2017 1
B8032P3 340-R3/17/2017 1
C0999 P5 4
C-1035 P3 1
B1011P5 330-L2/23/2017 1
B1011P5 329-L2/14/2017 1

I have tried every solution I could find on Google. They all run, but none of them prints any result. The solutions I tried are:

grep -v -F -x -f Book1.txt InventorySheet2finaloutput.txt (tried every combination of grep flags)
awk 'NR == FNR {Book1[$0]++; next} ($0 in Book1)' Book1.txt InventorySheet2finaloutput.txt
awk 'NR==FNR{a[$1];next}$1 in a{print $1}' Book1.txt InventorySheet2finaloutput.txt
grep "$(cat Book1.txt)" InventorySheet2finaloutput.txt

I want to find out whether the contents of Book1 are present in InventorySheet or not.

Answer 1

Oh, I get it now: the contents of Book1 are supposed to be the prefix (with, it seems, an optional hyphen) of the lines of InventorySheet. So, given B1003 in Book1, we match the B1003P5 line in InventorySheet, and C1035 matches C-1035.

grep -Ef <(sed -E 's/^/^/; s/([[:alpha:]])([[:digit:]])/\1-?\2/' Book1) InventorySheet

That uses sed to generate the extended regular expressions from the Book1 file, and the process substitution (available in bash, ksh and zsh) allows us to hand grep a "pseudo-filename".

Given your sample files, the grep command outputs:

B1011P5 330-L2/23/2017 1
B8032P3 336-L3/10/2017 1
B1011P5 329-L2/14/2017 1
B8025 P4 5
B8025 P5 1
C0999 P5 4
B1003P5 325-R3/3/2017 1
B8032P4 384-R4/26/2017 1
C-1035 P3 1
B8032P3 340-R3/17/2017 1
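
To see what the sed step in the grep command generates, here is a minimal sketch that runs the same substitution over three of the Book1 IDs. The function name make_patterns is made up for illustration and is not part of the command above.

```shell
# Hypothetical helper: turn each ID on stdin into an anchored extended
# regex with an optional hyphen between the first letter-digit pair.
make_patterns() {
    sed -E 's/^/^/; s/([[:alpha:]])([[:digit:]])/\1-?\2/'
}

printf '%s\n' B1003 C1035 PATX248 | make_patterns
# prints:
# ^B-?1003
# ^C-?1035
# ^PATX-?248
```

Each pattern is anchored at the start of the line, which is why the inventory lines that begin with a space (such as the 1003P4 one) are not matched.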

In awk, the same prefix match would be:

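awk '
    # First file (Book1): remember each ID as an array key.
    NR==FNR {book[$1]; next}
    # Second file: strip hyphens from the first field, then print the
    # line if any Book1 ID matches at the start of that field.
    { 
        key=$1
        gsub(/-/, "", key)
        for (b in book) 
            if (key ~ "^"b) {print; break}
    }
' Book1 InventorySheet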
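
If what you need is a per-ID answer (is each Book1 entry present or not) rather than the matching inventory lines, the same prefix test can be turned around. This is only a sketch under the same assumptions as the awk code above: hyphens in the first field are ignored and a Book1 ID must be a prefix of it. The function name report_ids, the FOUND/MISSING labels and the tiny sample files are made up for illustration.

```shell
# Hypothetical helper: report_ids BOOK INVENTORY
# Prints each distinct Book1 ID, in file order, tagged FOUND or MISSING.
report_ids() {
    awk '
        # First file: record each distinct non-empty ID in order.
        NR==FNR {
            if (NF && !($1 in seen)) { seen[$1]; order[++n] = $1 }
            next
        }
        # Second file: mark every ID that is a prefix of the
        # hyphen-stripped first field.
        {
            key = $1
            gsub(/-/, "", key)
            for (i = 1; i <= n; i++)
                if (index(key, order[i]) == 1) found[order[i]] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i], ((order[i] in found) ? "FOUND" : "MISSING")
        }
    ' "$1" "$2"
}

# Tiny sample files, a subset of the data in the question.
tmp=$(mktemp -d)
printf '%s\n' PATX248 B1003 C1035 B1011 > "$tmp/Book1"
printf '%s\n' 'B1011P5 330-L2/23/2017 1' 'PATX-55 P4 4' \
    'B1003P5 325-R3/3/2017 1' 'C-1035 P3 1' > "$tmp/InventorySheet"

report_ids "$tmp/Book1" "$tmp/InventorySheet"
# PATX248 MISSING
# B1003 FOUND
# C1035 FOUND
# B1011 FOUND
```

The sketch uses index() instead of a dynamic regex so that an ID is always compared as a literal string.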

Answer 2

As best I can tell, this does what you say you want, and the expected output posted in your question is wrong:

$ cat tst.awk
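{
    # Runs for every line of both files: drop non-alphanumerics from
    # the first field, then keep its leading letters+digits as the key.
    key=$1
    gsub(/[^[:alnum:]]/,"",key)
    match(key,/^[[:upper:]]+[[:digit:]]+/)
    key = substr(key,RSTART,RLENGTH)
}
# First file: store the keys.
NR==FNR { keys[key]; next }
# Second file: print lines whose key was stored.
key in keys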

$ awk -f tst.awk Book1.txt Inventory.txt
B1011P5 330-L2/23/2017 1
B8032P3 336-L3/10/2017 1
B1011P5 329-L2/14/2017 1
B8025 P4 5
B8025 P5 1
C0999 P5 4
B1003P5 325-R3/3/2017 1
B8032P4 384-R4/26/2017 1
C-1035 P3 1
B8032P3 340-R3/17/2017 1
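
To make the normalisation in tst.awk easier to follow, here is a small sketch that prints the key computed for a few sample lines. The function name show_keys and the "->" output format are made up for illustration, and the explicit empty-key branch spells out what substr() returns when match() fails.

```shell
# Hypothetical helper: print the key derived from each input line.
show_keys() {
    awk '{
        key = $1
        gsub(/[^[:alnum:]]/, "", key)                  # drop hyphens etc.
        if (match(key, /^[[:upper:]]+[[:digit:]]+/))   # leading letters+digits
            key = substr(key, RSTART, RLENGTH)
        else
            key = ""                                   # no usable ID
        printf "%s -> [%s]\n", $1, key
    }'
}

printf '%s\n' 'B1003P5 325-R3/3/2017 1' 'C-1035 P3 1' 'PATX219B' \
    ' 1003P4 M#1N-L1/19/2017' | show_keys
# B1003P5 -> [B1003]
# C-1035 -> [C1035]
# PATX219B -> [PATX219]
# 1003P4 -> []
```

Two things are worth noticing. A trailing letter is dropped, so PATX219B in Book1.txt is compared as PATX219. And a line with no recognisable ID gets an empty key, so a blank line in Book1.txt would store that empty key and make every such inventory line match; filtering blank lines out first may be a sensible precaution.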

2 answers

Answer 1
1, accepted, 2017-09-28 19:30:58

Answer 2
0, 2017-09-28 22:22:31
