Using awk to match a column across two files and subtract dates for the matching records

Question

Using awk, I want to match the first column of one file against the first column of another file, and then subtract the dates (in days) for the matching records.

Suppose I have two files:

file1:

123,2-jul-2016
124,2-jul-2018

file2:

123,2-jul-2015
124,2-jul-2017

For each ID that appears in both files, the output should be:

123,366
124,366

Thanks for the help

Answer 1

Could you please try the following? It uses mktime, which is a GNU awk extension rather than POSIX awk.

awk '
BEGIN{
  FS=OFS=","
  num=split("jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec",month,",")
  for(i=1;i<=num;i++){
    daymonth[month[i]]=i}
}
FNR==NR{
  a[$1]=$2
  next
}
($1 in a){
  split($2,array2,"-")
  split(a[$1],array1,"-")
  print $1,(mktime(sprintf("%d %d %d 0 0 0 0",array2[3],daymonth[array2[2]],array2[1]))-\
            mktime(sprintf("%d %d %d 0 0 0 0",array1[3],daymonth[array1[2]],array1[1])))\
                  /86400
}' file2 file1

Answer 2

With GNU awk for time functions:

$ cat tst.awk
BEGIN {
    FS  = "[,-]"
    OFS = ","
}
{
    mthNr = (index("janfebmaraprmayjunjulaugsepoctnovdec",$3)+2)/3
    date  = sprintf("%04d %02d %02d 00 00 00", $4, mthNr, $2)
    secs  = mktime(date)
}
NR==FNR {
    end[$1] = secs
    next
}
$1 in end {    # only print IDs that also appeared in the first file
    print $1, int((end[$1] - secs) / (24*60*60))
}

$ awk -f tst.awk file1 file2
123,366
124,365

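The expected output in your question is wrong for ID 124. From 2-jul-2015 to 2-jul-2016 is 366 days because that range contains the leap day 29-feb-2016, but from 2-jul-2017 to 2-jul-2018 is only 365 days because that range contains no leap day. Expecting 366 for both means either the leap day in Feb 2016 wasn't accounted for, or the non-leap range is being counted inclusively, as if the difference between 3 and 4 were 2 instead of 1.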
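If GNU awk isn't available, the same day counts can be cross-checked without mktime. The following is only a sketch, not part of either answer: it converts each d-mon-yyyy date to a Julian Day Number with the standard integer formula and subtracts. The names date_diff and jdn and the scratch directory are made up for illustration, and it assumes well-formed dates with three-letter English month abbreviations.

```shell
# Sample data from the question, written to a scratch directory.
tmp=$(mktemp -d)
printf '123,2-jul-2016\n124,2-jul-2018\n' > "$tmp/file1"
printf '123,2-jul-2015\n124,2-jul-2017\n' > "$tmp/file2"

# date_diff (hypothetical helper): join on column 1 and print
# "id,days" where days = date in first file minus date in second file.
date_diff() {
    awk -F, -v OFS=, '
    # Julian Day Number of a d-mon-yyyy date (Gregorian calendar).
    function jdn(date,    p, d, m, y, a) {
        split(tolower(date), p, "-")
        d = p[1] + 0
        m = (index("janfebmaraprmayjunjulaugsepoctnovdec", p[2]) + 2) / 3
        a = int((14 - m) / 12)          # 1 for jan/feb, otherwise 0
        y = p[3] + 4800 - a
        m = m + 12 * a - 3              # mar = 0 ... feb = 11
        return d + int((153 * m + 2) / 5) + 365 * y \
                 + int(y / 4) - int(y / 100) + int(y / 400) - 32045
    }
    NR == FNR { end[$1] = jdn($2); next }       # first file: remember each date
    $1 in end { print $1, end[$1] - jdn($2) }   # second file: matched IDs only
    ' "$1" "$2"
}

result=$(date_diff "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample files this prints 123,366 and 124,365, agreeing with the GNU awk output above. Because it works on calendar days rather than seconds, it is also unaffected by the local time zone or daylight saving changes.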

solution1 0 2018-07-29 17:09:32
solution2 0 ACCEPTED 2018-07-30 01:17:33
