How to Compare two files using AWK or GREP

I have two questions about two files that I am trying to sort and filter. Each file has two columns: file1 contains an IP address and a port, and file2 contains a domain and an IP address.

file1:

Address,Port
1.2.3.4,8080
4.5.6.7,80
6.7.8.9,443

file2:

Domain,IP
google.com,1.2.3.4
google.fe,6.7.8.9
admin.ko,3.2.4.5

First question: I want to find the IPs in file1 that don't match any IP in file2.

I have tried using awk, and here is what I used:

awk -F',' 'FNR==NR{ a[$2]; next } !($1 in a)' file2 file1

I don't understand awk very well, so could someone also explain each part of the awk command you provide? :)

Desired output:

Address,Port,Status
1.2.3.4,8080,Present
4.5.6.7,80,Not-Present
6.7.8.9,443,Present

I have no idea how to do the next one, so kindly assist.

Second question: I want the same output as in the first question, but this time with a Domain column added.

Desired output:

Address,Port,Domain,Status
1.2.3.4,8080,google.com,Present
4.5.6.7,80,NULL,Not-Present
6.7.8.9,443,google.fe,Present

Thank you in advance.

Here is a line-by-line explanation of the code you posted.

awk -F',' '    ##Start the awk program and use a comma as the field separator for every line.
FNR==NR{       ##FNR==NR is TRUE only while the first input file (file2) is being read.
  a[$2]        ##Create an element in array a, indexed by $2 (the IP) of the current line.
  next         ##Skip all remaining rules and move on to the next line.
}
!($1 in a)     ##For file1: if $1 (the IP) is NOT an index in a, print that line (the default action).
' file2 file1  ##Input file names; file2 is read first.
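
To see what the command explained above prints on the sample data, here is a small runnable sketch. The scratch directory created with mktemp is an assumption made only to avoid overwriting real files; the data is copied from the question. Note that the command prints only the rows of file1 whose first field is not an IP in file2 (the header line included), so on its own it does not produce the Status column.

```shell
# Work in a scratch directory (an assumption, to avoid clobbering existing files).
cd "$(mktemp -d)" || exit 1

# Sample data copied from the question.
printf '%s\n' 'Address,Port' '1.2.3.4,8080' '4.5.6.7,80' '6.7.8.9,443' > file1
printf '%s\n' 'Domain,IP' 'google.com,1.2.3.4' 'google.fe,6.7.8.9' 'admin.ko,3.2.4.5' > file2

# Pass 1 (file2): remember every IP as an index of a.
# Pass 2 (file1): print the rows whose first field is not an index of a.
result=$(awk -F',' 'FNR==NR{ a[$2]; next } !($1 in a)' file2 file1)
printf '%s\n' "$result"
# Prints:
# Address,Port
# 4.5.6.7,80
```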


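For your second question, you could try the following code.

awk '
BEGIN{
  FS=OFS=","                          ##Use a comma as input and output field separator.
}
FNR==1{                               ##Header line of either file.
  if(FNR!=NR){                        ##Only file1 (the second file) prints a header.
    print $1,$2,"Domain","Status"
  }
  next
}
FNR==NR{                              ##file2: remember the domain for each IP.
  a[$2]=$1
  next
}
{                                     ##file1: add the domain (or NULL) and the status.
  if($1 in a){
    print $0,a[$1],"Present"
  }
  else{
    print $0,"NULL","Not-Present"
  }
}
' file2  file1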

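1) This one is very close to your original attempt. We just set column 3 differently depending on whether the line is the header (FNR == 1) or a data row.

Notes:

  • The header of file2 never contains an IP, so storing it in a[] makes no real difference.
  • The 1 at the end of the script is important: it is an always-true pattern with no action, so awk applies its default action and prints the (modified) line to stdout.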
    awk -F, -v OFS=, '
        FNR == NR { a[$2] = ""; next }
        FNR == 1  { $3 = "Status" }
        FNR != 1  { $3 = (($1 in a) ? "Present" : "Not-Present") }
        1' file2 file1
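
As a quick illustration of the note about the trailing 1, here is a minimal sketch using a one-line sample input (the values x, y and z are made up for the demonstration). Assigning to $3 makes awk rebuild the line using OFS, and the bare 1 is what causes it to be printed.

```shell
# Assigning to $3 rebuilds the record with OFS; the bare 1 then prints it.
with_one=$(printf 'x,y\n' | awk -F, -v OFS=, '{ $3 = "z" } 1')

# Without the trailing 1 no rule prints anything, so the output is empty.
without_one=$(printf 'x,y\n' | awk -F, -v OFS=, '{ $3 = "z" }')

printf 'with 1:    [%s]\n' "$with_one"
printf 'without 1: [%s]\n' "$without_one"
# Prints:
# with 1:    [x,y,z]
# without 1: []
```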

2) Here a[] stores the domain for each IP, and we set columns 3 and 4 accordingly.

    awk -F, -v OFS=, '
        FNR == NR { a[$2] = $1; next }
        FNR == 1  { $3 = "Domain"; $4 = "Status" }
        FNR != 1  {
            if ($1 in a) {
                $3 = a[$1];  $4 = "Present"
            } else {
                $3 = "NULL"; $4 = "Not-Present"
            }
        }
        1' file2 file1

Another option, which skips the header line of both files and therefore prints only the data rows:

awk -F,  '
          BEGIN{OFS=","}           # Set the output field separator
          FNR==1{next}             # Skip the header line of each file
          NR==FNR{a[$2]=$1;next}   # file2: store the domain in an associative array keyed by IP
          {                        # file1: process each record
           if($1 in a)
                   {$3="Present"}
           else
                   {$3="Not-Present"}
           print
          }' file2 file1
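
One detail worth knowing when testing array membership in awk: referencing a[key] creates an empty element for that key as a side effect, while (key in a) only looks the key up. A minimal sketch, using sample values from the question and counting elements with a for-in loop:

```shell
result=$(awk 'BEGIN {
    a["1.2.3.4"] = "google.com"

    # The in operator only looks the key up; the array is left untouched.
    if (!("4.5.6.7" in a)) print "in: not found"
    n = 0; for (k in a) n++
    print "elements after in test: " n

    # Referencing a[key] creates an empty element for that key.
    if (!a["4.5.6.7"]) print "subscript: not found"
    n = 0; for (k in a) n++
    print "elements after subscript test: " n
}')
printf '%s\n' "$result"
# Prints:
# in: not found
# elements after in test: 1
# subscript: not found
# elements after subscript test: 2
```

Both tests give the same answer here, but the subscript form grows the array, and it would also treat an IP stored with an empty value as missing.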
