
Match strings from two files and append line with matching string from first file to end of line of second file

So this is a bit tricky and I'm having a heck of a time trying to figure it out.

I have two different files, one is in rackdiag format like so:

#file1
rackdiag {
  rack {
    42U;
    description = "1.1.1";
    1: "serverone" [4U];
    5: servertwo [2U];
    7: serverthree\nblah [3U];
  }
  rack {
    42U;
    description = "1.1.2";
    1: servertwoone [4U];
    5: "servertwotwo" [2U];
   }
 }

Etc.

And the other is a list of server names like so:

#file2
serverone.domain.com
servertwo.domain.com
serverthree.domain.com

I'm trying to match strings between the two files and insert the line with the matching string from the first file onto the end of the second file with a couple additions. I want it to end up like this in the second file:

serverone.domain.com #1.1.1 1: "serverone" [4U];
servertwo.domain.com #1.1.1 5: servertwo [2U];
servertwoone.domain.com #1.1.2 1: servertwoone [4U]; 

I managed to get this far:

#!/bin/bash

cat serverlist.txt | while read line;
do
#grep for matching strings and output entire line when match found to $line2 variable
line2=$(grep -w "$line" row01.txt)
echo "$line "#" $line2" 
done > halp.txt
exit

Which outputs this:

servertwo.domain.com #5: servertwo.domain.com [2U];

But I noticed that it's missing some that should match for some reason.

Like, in the actual file I have this line

   33: servername [2U];

And this line in the second file:

servername.blahhosting.com

When I tried running the script the output was only:

servername.blahhosting.com #

Would anybody be able to help me both getting the 1.1.1/1.1.2 etc. to appear in the output and to figure out why it might be missing some of the lines that match?

Thank you very much!

Edit 1:

rackdiag {
   rack {
       42U;
       description = "5.1.1";
       1: servertwoone [4U];
       1: "servertwoone" [4U];
       1: servertwoone\nserveronetwo [4U];
       1: "servertwoone\nserveronetwo" [4U];
       1: servertwo-1\nserverone1 [4U];
       1: "servertwo-2\nserverone2" [4U];
       1: servertwoone-1 [4U];
       1: servertwoone-2 [4U];
       1: servertwoone1 [4U];
       1: servertwoone2 [4U];
       1: servertwoone;
   }
   rack {
       42U;
       description = "5.1.2";
       1: server two one [4U];
       1: servertwoone [4U];
       1: server.two.one [4U];
   }
}

If an entry has no [2U] etc. at the end (it's just blank), that means it's [1U].

In the case of names with \n, that means that the server has more than one label on the physical case. I think that's it.

Your question isn't clear but here's the right approach and a start towards solving your problem:

$ cat tst.awk
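NR==FNR {
    # First file (rackdiag): build a lookup table keyed by server name.
    if ( $1 == "description" ) {
        # Remember the current rack's description without the quotes and ";".
        desc = $NF
        gsub(/^"|";$/,"",desc)
    }
    else if ( $1 ~ /^[0-9]+:/ ) {
        # Entry line such as:  5: servertwo [2U];
        nmbr = $1
        # Drop the leading "N: " so $0 starts at the server name(s).
        sub(/^[[:space:]]*[^[:space:]]+[[:space:]]+/,"")

        if ( $NF ~ /\[.*\];$/ ) {
            # A trailing "[nU];" field is present: save it and strip it from $0.
            blob = $NF
            sub(/[^[:space:]]+$/,"")
        }
        else {
            # No size field, which the question says means 1U.
            blob = "[1U];"
        }
        sub(/[[:space:]]+$/,"")

        # A literal "\n" separates multiple labels; store the line under each one.
        numSrvrs = split($0,srvrs,/\\n/)
        for (srvrNr=1; srvrNr<=numSrvrs; srvrNr++) {
            srvr = srvrs[srvrNr]
            gsub(/^"|"$/,"",srvr)
            srvr2data[srvr] = "#" desc " " nmbr " " $0 " " blob
            printf "TRACE: srvr2data[%s] = <%s>\n", srvr, srvr2data[srvr]
        }
    }
    next
}
{
    # Second file (server list): look up the part of the name before the first ".".
    srvr = $0
    sub(/\..*/,"",srvr)
    print $0, srvr2data[srvr]
}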

When run against your first 2 sample input files:

$ awk -f tst.awk file1 file2
TRACE: srvr2data[serverone] = <#1.1.1 1: "serverone" [4U];>
TRACE: srvr2data[servertwo] = <#1.1.1 5: servertwo [2U];>
TRACE: srvr2data[serverthree] = <#1.1.1 7: serverthree\nblah [3U];>
TRACE: srvr2data[blah] = <#1.1.1 7: serverthree\nblah [3U];>
TRACE: srvr2data[servertwoone] = <#1.1.2 1: servertwoone [4U];>
TRACE: srvr2data[servertwotwo] = <#1.1.2 5: "servertwotwo" [2U];>
serverone.domain.com #1.1.1 1: "serverone" [4U];
servertwo.domain.com #1.1.1 5: servertwo [2U];
serverthree.domain.com #1.1.1 7: serverthree\nblah [3U];
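
Note that the TRACE lines go to standard output together with the results, so redirecting the command to a file would capture both. Once the table looks right, the printf can simply be deleted. Alternatively, here is a rough sketch of sending the trace to stderr instead; the function name `trace_demo` is made up for illustration, and `/dev/stderr` as an output name is handled by gawk, mawk and BSD awk but should be treated as an assumption for other awks:

```shell
#!/bin/bash
# Minimal illustration: the trace goes to stderr, the result goes to stdout.
trace_demo() {
    awk 'BEGIN {
        printf "TRACE: srvr2data[%s] = <%s>\n", "servertwo", "#1.1.1 5: servertwo [2U];" > "/dev/stderr"
        print "servertwo.domain.com #1.1.1 5: servertwo [2U];"
    }'
}

# Discarding stderr leaves only the result line on stdout.
trace_demo 2>/dev/null
```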

When run using your 3rd input file (Edit 1 in your question) with no associated "file2" (since you didn't provide one), all you get is the trace output produced as the data from the first file is populated:

$ awk -f tst.awk file3 /dev/null
TRACE: srvr2data[servertwoone] = <#5.1.1 1: servertwoone [4U];>
TRACE: srvr2data[servertwoone] = <#5.1.1 1: "servertwoone" [4U];>
TRACE: srvr2data[servertwoone] = <#5.1.1 1: servertwoone\nserveronetwo [4U];>
TRACE: srvr2data[serveronetwo] = <#5.1.1 1: servertwoone\nserveronetwo [4U];>
TRACE: srvr2data[servertwoone] = <#5.1.1 1: "servertwoone\nserveronetwo" [4U];>
TRACE: srvr2data[serveronetwo] = <#5.1.1 1: "servertwoone\nserveronetwo" [4U];>
TRACE: srvr2data[servertwo-1] = <#5.1.1 1: servertwo-1\nserverone1 [4U];>
TRACE: srvr2data[serverone1] = <#5.1.1 1: servertwo-1\nserverone1 [4U];>
TRACE: srvr2data[servertwo-2] = <#5.1.1 1: "servertwo-2\nserverone2" [4U];>
TRACE: srvr2data[serverone2] = <#5.1.1 1: "servertwo-2\nserverone2" [4U];>
TRACE: srvr2data[servertwoone-1] = <#5.1.1 1: servertwoone-1 [4U];>
TRACE: srvr2data[servertwoone-2] = <#5.1.1 1: servertwoone-2 [4U];>
TRACE: srvr2data[servertwoone1] = <#5.1.1 1: servertwoone1 [4U];>
TRACE: srvr2data[servertwoone2] = <#5.1.1 1: servertwoone2 [4U];>
TRACE: srvr2data[servertwoone;] = <#5.1.1 1: servertwoone; [1U];>
TRACE: srvr2data[server two one] = <#5.1.2 1: server two one [4U];>
TRACE: srvr2data[servertwoone] = <#5.1.2 1: servertwoone [4U];>
TRACE: srvr2data[server.two.one] = <#5.1.2 1: server.two.one [4U];>
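
One detail in the trace above: the entry with no size field ends up stored under the key `servertwoone;`, so a lookup for `servertwoone` would not find that particular line. A possible way to handle it, assuming a trailing `;` is never part of a server name, is to strip it in the `else` branch. The reduced sketch below contains only the entry-parsing part of tst.awk, wrapped in a hypothetical `parse_entry` function, and prints the pieces instead of storing them:

```shell
#!/bin/bash
# Reduced sketch: entry parsing only, with the trailing ";" removed when
# no "[nU];" field is present. Sample lines are taken from "Edit 1".
parse_entry() {
    awk '
    $1 ~ /^[0-9]+:/ {
        nmbr = $1
        sub(/^[[:space:]]*[^[:space:]]+[[:space:]]+/,"")
        if ( $NF ~ /\[.*\];$/ ) {
            blob = $NF
            sub(/[^[:space:]]+$/,"")
        }
        else {
            blob = "[1U];"
            # Assumption: the ";" only terminates the entry.
            sub(/;[[:space:]]*$/,"")
        }
        sub(/[[:space:]]+$/,"")
        print nmbr, $0, blob
    }'
}

printf '       1: servertwoone;\n' | parse_entry
printf '       1: servertwoone [4U];\n' | parse_entry
```

With that change the first line is reported as `1: servertwoone [1U];`, and entries that already have a size field are handled as before.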

You didn't tell us what that "[4U]" field is so I named it blob - obviously change it to whatever it is.
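
As for why your original grep loop printed nothing after the # for some servers: $line holds the fully qualified name (e.g. servername.blahhosting.com) while the rack file only contains the short name, so grep -w "$line" has nothing to match. The sub(/\..*/,"",srvr) step in the script above deals with this by cutting the name at the first dot. A minimal bash sketch of the same idea, using a temporary file as a stand-in for your rack file (the variable names are just for illustration):

```shell
#!/bin/bash
# Stand-in for the rack file, holding the line quoted in the question.
rackfile=$(mktemp)
printf '   33: servername [2U];\n' > "$rackfile"

fqdn='servername.blahhosting.com'

# The full name does not occur in the rack file, so this finds nothing.
full_match=$(grep -w -- "$fqdn" "$rackfile")

# Cut the name at the first dot, then search for the short name.
short=${fqdn%%.*}
short_match=$(grep -w -- "$short" "$rackfile")

printf 'full:  <%s>\n' "$full_match"
printf 'short: <%s>\n' "$short_match"
rm -f "$rackfile"
```

Even with the short name, grep -w servertwo would also match a line containing servertwo-1, because - is not a word character, which is one more reason to prefer the exact array lookup in the awk script.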
