I am working on extracting service objects (ports/protocols) from a large router configuration file. Using awk, I would like to be able to take unique instances of $5 and print them on one line, with the various values in $7 printed after the unique instance of $5 separated by commas.
Input data:
set resources group port-group ServiceA port '1'
set resources group port-group ServiceA port '2'
set resources group port-group ServiceA port '3'
set resources group port-group ServiceB port '10'
set resources group port-group ServiceA port '1'
set resources group port-group ServiceA port '2'
set resources group port-group ServiceA port '3'
set resources group port-group ServiceB port '10'
set resources group port-group ServiceB port '20'
set resources group port-group ServiceC port '30'
set resources group port-group ServiceC port '40'
set resources group port-group ServiceD port '50'
set resources group port-group ServiceD port '5050'
set resources group port-group ServiceD port '60'
set resources group port-group ServiceD port '65'
set resources group port-group ServiceD port '66'
set resources group port-group ServiceD port '89'
Desired Output:
set resources group port-group ServiceA port 1, 2, 3
set resources group port-group ServiceB port 10, 20
set resources group port-group ServiceC port 30, 40
set resources group port-group ServiceD port 50, 5050, 60, 65, 66, 89
So far my attempts at making awk statements have not been fruitful.
What I've tried (it's part of a script, which is why there are line breaks):
awk '{
gsub(/[:\47]/,"")}; i=!seen[$5]++; {print i,$7 } ' inputfile.txt
This gives me the following output:
set resources group port-group ServiceA port 1
1 1
0 2
0 3
set resources group port-group ServiceB port 8
1 8
0 1
0 2
0 3
0 8
0 3
set resources group port-group ServiceC port 2
1 2
0 3
set resources group port-group ServiceD port 8
1 8
0 5050
0 3
0 83
0 1
0 2
0 990
0 3000
0 3001
0 3002
0 3003
I'm assuming I will have to use a multidimensional array with a for loop to accomplish this, but I'm stuck. Any help is appreciated!
awk solution:
awk '!a[$5]{a[$5]=$0; uniq[$5,$7]=$7}{ if ($5 in a && uniq[$5,$7]!=$7){
a[$5]=a[$5]","$7; uniq[$5,$7]=$7}}END{for(i in a) print a[i]}' inputfile.txt
The output:
set resources group port-group ServiceA port '1','2','3'
set resources group port-group ServiceB port '10','20'
set resources group port-group ServiceC port '30','40'
set resources group port-group ServiceD port '50','5050','60','65','66','89'
!a[$5]{a[$5]=$0; uniq[$5,$7]=$7}
- capture the whole line at the first occurrence of each 5th field value
if($5 in a && uniq[$5,$7]!=$7)
- skip port values that were already recorded for the same service; the uniq array accumulates each combination of the 5th and 7th fields seen so far
a[$5]=a[$5]","$7
- append the next unique value to the end of the line stored for that service
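One caveat with the one-liner: for (i in a) visits keys in an unspecified order, so the order of the output lines is not guaranteed. Below is a minimal sketch of the same idea that keeps services in first-seen order. The function name group_ports and the arrays seen, line and order are illustrative names, not part of the original answer, and the input is a shortened copy of the data from the question.

```shell
# Shortened sample of the input from the question.
input="set resources group port-group ServiceA port '1'
set resources group port-group ServiceA port '2'
set resources group port-group ServiceB port '10'
set resources group port-group ServiceA port '1'
set resources group port-group ServiceB port '20'"

group_ports() {
    awk '
    !(($5, $7) in seen) {                # first time this service/port pair appears
        seen[$5, $7] = 1
        if (!($5 in line)) {             # first time this service appears
            order[++n] = $5              # remember arrival order (illustrative helper)
            line[$5] = $0                # start from the whole first line
        } else {
            line[$5] = line[$5] "," $7   # append further unique ports
        }
    }
    END {
        for (k = 1; k <= n; k++)         # first-seen order instead of for-in order
            print line[order[k]]
    }'
}

printf '%s\n' "$input" | group_ports
# set resources group port-group ServiceA port '1','2'
# set resources group port-group ServiceB port '10','20'
```

The quotes are kept on purpose here, so the result matches the format of the one-liner's output.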
To get values without single quotes use the following approach:
group_port_values.awk script:
#!/bin/awk -f
BEGIN { FS="[ ']" }
!a[$5] {
a[$5] = $0;
uniq[$5,$8] = $8
}
{
if ($5 in a && uniq[$5,$8] != $8) {
a[$5] = a[$5]", "$8;
uniq[$5,$8] = $8
}
}
END {
for (i in a) {
gsub(/\047/,"",a[i]);
print a[i]
}
}
Usage:
awk -f group_port_values.awk inputfile.txt
The output:
set resources group port-group ServiceA port 1, 2, 3
set resources group port-group ServiceB port 10, 20
set resources group port-group ServiceC port 30, 40
set resources group port-group ServiceD port 50, 5050, 60, 65, 66, 89
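In case it is not obvious why the script uses $8 rather than $7: with FS="[ ']" every space and every single quote is a separator, so the space before the opening quote and the quote itself enclose an empty 7th field, and the port number lands in the 8th. A quick check, as a sketch (the variable name line is only for illustration):

```shell
# One line of the input from the question.
line="set resources group port-group ServiceA port '1'"

# Split on spaces and single quotes, then show fields 7 and 8 in brackets.
printf '%s\n' "$line" | awk -F"[ ']" '{ printf "[%s][%s]\n", $7, $8 }'
# prints [][1]
```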
TGIF! Here's one for GNU awk using 2D arrays and bad coding habits (:
$ awk '
++a[$5 , $7]==1 { # if not seen before
b[$5][++c[$5]]=$7 } # hash it to b[key][index]
END{
for(i in b) { # for all keys
for(j=1;j<=c[i];j++) # and all its indexes
d=(j==1?"":d",")b[i][j] # gather buffer
sub($5,i) # use the last known $0
sub($NF,d) # and replace key and buffer to it
print } # output
}' file
set resources group port-group ServiceA port '1','2','3'
set resources group port-group ServiceB port '10','20'
set resources group port-group ServiceC port '30','40'
set resources group port-group ServiceD port '50','5050','60','65','66','89'
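The b[$5][...] syntax is a GNU awk extension (arrays of arrays). If gawk is not available, the usual workaround is a comma subscript such as ports[service, index], which awk joins into a single key with SUBSEP. Here is a rough sketch along those lines that also strips the quotes and rebuilds the line from its fields instead of editing the last $0 in END. All names (group_ports_portable, seen, keys, prefix, ports, count) are made up for this example.

```shell
group_ports_portable() {
    awk '
    {
        gsub(/\047/, "")                  # drop the single quotes around the port
    }
    !(($5, $7) in seen) {                 # new service/port pair
        seen[$5, $7] = 1
        if (!($5 in count)) {             # new service
            keys[++nkeys] = $5            # first-seen order of services
            prefix[$5] = $1 " " $2 " " $3 " " $4 " " $5 " " $6
        }
        ports[$5, ++count[$5]] = $7       # pseudo-2D array: ports[service, index]
    }
    END {
        for (k = 1; k <= nkeys; k++) {
            svc = keys[k]
            list = ""
            for (j = 1; j <= count[svc]; j++)
                list = list (j == 1 ? "" : ", ") ports[svc, j]
            print prefix[svc], list
        }
    }'
}

# Shortened sample of the input from the question.
printf '%s\n' \
    "set resources group port-group ServiceA port '1'" \
    "set resources group port-group ServiceA port '2'" \
    "set resources group port-group ServiceB port '10'" \
    "set resources group port-group ServiceA port '1'" \
    "set resources group port-group ServiceB port '20'" |
    group_ports_portable
# set resources group port-group ServiceA port 1, 2
# set resources group port-group ServiceB port 10, 20
```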
$ cat tst.awk
BEGIN { OFS=", " }
{ gsub(/\047/,""); pfx=$1 FS $2 FS $3 FS $4 }
$5 != prev { prt(prev); prev=$5 }
!seen[$7]++ { ports[++numPorts] = $7 }
END { prt(prev) }
function prt(sg) {
if ( sg != "" ) {
printf "%s %s ", pfx, sg
for (portNr=1; portNr<=numPorts; portNr++) {
printf "%s%s", ports[portNr], (portNr<numPorts ? OFS : ORS)
}
delete ports
delete seen
numPorts = 0
}
}
$ sort file | awk -f tst.awk
set resources group port-group ServiceA 1, 2, 3
set resources group port-group ServiceB 10, 20
set resources group port-group ServiceC 30, 40
set resources group port-group ServiceD 50, 5050, 60, 65, 66, 89