繁体   English   中英

在unix中的两个固定格式文件中查找字段值 - 不起作用

[英]lookup field values in two fixed format file in unix - not working

我有2个固定长度文件输入#1和输入#2。 我想根据两个文件中位置37-50的值来匹配行(pos 37-50在两个文件中都有相同的值)。

如果找到任何匹配记录,则根据输入文件#1(位置99直到行尾)的公司代码和发票编号剪切该值。

剪切字符串(来自输入#1)需要附加在记录/行的末尾。

下面是我尝试的代码(不工作)以及输入文件和所需的输出。 请提供您的建议。

码:

awk '
NR==FNR && NF>1 {
    v=substr($0,37,14);
#print substr($0,37,14)
    next
}
NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
    sub(/Company Code/,"",$0);
    sub(/Invoice Number/,"",$0);
    a[v]=$0;
print $0
    next
}
(substr($0,37,14) in a) {
    print $0 a[substr($0,99)]
}' Input1.txt input2.txt input3.txt

结束代码

输入#1以一些空格开头的 Start

         612  1111111111201402120000       2     1  111  211 Due Date                             20140101                           
         612  1111111111201402120000       2     1  111  311 Company Code                         227                                
         612  1111111111201402120000       2     1  111  411 Item Code                            12                                 
         612  1111111111201402120000       2     1  111  511 Invoice Number                       2014010                            
         612  1111111111201402120000       2     2  111  611 Company Code                         214                                
         612  1111111111201402120000       2     2  111  711 Item Code                            20                                 
         612  1111111111201402120000       2     2  111  811 Invoice Number                       3014010                            
         612  1111111111201402120000       2     3  111  911 Due Date                             20140101                           
         612  1111111111201402120000       2     3  111  111 Invoice Number                       40140101                           
         612  1111111111201402120000       2     3  111  121 user code                            15563263636                        
         612  1111111111201402120000       2     3  111  131 Amount Due                           100000                             
         612  111111111120140212000078978982123444  111  141 Due Date                             20140101                             
         612  111111111120140212000078978982123444  111  151 Invoice Number                       50140101                             
         612  111111111120140212000078978982123444  111  161 Amount Due                          008000                             

输入#1结束

输入#2开始输入2

         510       77432201111010000       2     1        1ChK          100111000001    121000248           123456789            20111101.510.77432.20001C                         
         510       77432201111010000       2     1        2INv                                                                   20111101.510.77432.20001D                         
         510       77432201111010000       2     1        3INv                                                                   20111101.510.77432.20002D                         
         510       77432201111010000       2     1        4INv                                                                   20111101.510.77432.20003D                         
         510       77432201111010000       2     1        5INv                                                                   20111101.510.77432.20004D                         
         510       77432201111010000       2     2        1ChK          200111000002    121000248           123456789            20111101.510.77432.20002C                         
         510       77432201111010000       2     2        2INv                                                                   20111101.510.77432.20005D                         
         510       77432201111010000       2     2        3INv                                                                   20111101.510.77432.20006D                         
         510       77432201111010000       2     2        4INv                                                                   20111101.510.77432.20007D                         
         510       77432201111010000       2     2        5INv                                                                   20111101.510.77432.20008D                         
         510       77432201111010000       2     3        1ChK          300111000003    121000248           123456789            20111101.510.77432.20003C                         
         510       77432201111010000       2     3        2INv                                                                   20111101.510.77432.20009D                         
         510       77432201111010000       2     3        3INv                                                                   20111101.510.77432.20010D                         
         510       77432201111010000       2     3        4INv                                                                   20111101.510.77432.20011D                         
         510       77432201111010000       2     6        1ChK          600111000006    121000248           123456789            20111101.510.77432.20006C                         
         510       77432201111010000       2     6        2INv                                                                   20111101.510.77432.20021D                         
         510       77432201111010000       2     6        3INv                                                                   20111101.510.77432.20022D                         
         510       77432201111010000       2     6        4INv                                                                   20111101.510.77432.20023D                         
         510       77432201111010000       2     6        5INv                                                                   20111101.510.77432.20024D                         

输入#2结束

期望的outout期望的输出

         510       77432201111010000       2     1        1ChK          100111000001    121000248           123456789            20111101.510.77432.20001C   2272014010 (company & Inv # from input 1)                     
         510       77432201111010000       2     1        2INv                                                                   20111101.510.77432.20001D   2272014010                                            
         510       77432201111010000       2     1        3INv                                                                   20111101.510.77432.20002D   2272014010                                            
         510       77432201111010000       2     1        4INv                                                                   20111101.510.77432.20003D   (company & Inv # from input 1)                      
         510       77432201111010000       2     1        5INv                                                                   20111101.510.77432.20004D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        1ChK          200111000002    121000248           123456789            20111101.510.77432.20002C   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        2INv                                                                   20111101.510.77432.20005D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        3INv                                                                   20111101.510.77432.20006D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        4INv                                                                   20111101.510.77432.20007D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        5INv                                                                   20111101.510.77432.20008D   (company & Inv # from input 1)                      
         510       77432201111010000       2     3        1ChK          300111000003    121000248           123456789            20111101.510.77432.20003C   (company & Inv # from input 1)                      
         510       77432201111010000       2     6        1ChK          600111000006    121000248           123456789            20111101.510.77432.20006C   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        2INv                                                                   20111101.510.77432.20021D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        3INv                                                                   20111101.510.77432.20022D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        4INv                                                                   20111101.510.77432.20023D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        5INv                                                                   20111101.510.77432.20024D   <there is no matching record in input 1, this will be blank>                      

尝试这样的事情(未经测试)

awk '
NR==FNR && /Company Code/ {
    cc[$3,$4] = $NF;
    next;
}
NR==FNR && /Invoice Number/ {
    inv[$3,$4] = $NF;
    next;
}
NR==FNR {next}
{print $0 FS cc[$3,$4] inv[$3,$4]}' input1 input2

你的awk代码有几个问题。

让我们一步一步地完成它们:

  1. NR==FNR && NF>1 {...;next}NR==FNR && ... - > next将阻止对除第一个记录之外的所有记录执行第二个动作

  2. NR==FNR && ( /Company Code/ OR /Invoice Number/ ) { - > OR不是有效的awk语句, 逻辑OR是使用||完成的 (就像你使用&&而不是AND )。

  3. print $0 a[substr($0,99)] - > a[substr($0,99)]从第二个输入文件中记录的第99个位置获取所有内容,以查找数组,但是你的密钥来自37 -50。

我们可以用以下方式修复它们:

  1. 摆脱第一个操作中的next一个操作 ,并将第三个操作限制为第二个输入文件中的记录

  2. ||代替OR

  3. 使用substr($0,37,14)作为在asubstr(...,99)查找结果的键。

这会产生以下代码(删除诊断print命令和未使用的第三个输入文件):

awk '
NR==FNR && NF>1 {
    v=substr($0,37,14);
}
NR==FNR && ( /Company Code/ || /Invoice Number/ ) {
    sub(/Company Code/,"",$0);
    sub(/Invoice Number/,"",$0);
    a[v]=$0;
    next
}
NR!=FNR && (substr($0,37,14) in a) {
    print $0 substr(a[substr($0,37,14)],99)
}' input1.txt input2.txt

由于您的输入已关闭,我无法重现您想要的输出,但我希望您可以从这里找到它。

此外,我将您的代码缩短到以下版本,执行我认为您希望它从给定的输入开始执行的操作:

awk '
{key=substr($0,37,14)}
NR==FNR{
  if(/Company Code/||/Invoice Number/)array[key]=substr($0,98)
  next
}
(key in array){print $0,array[key]}
' input1.txt input2.txt

如果您需要调整/解释,请随时发表评论。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM