How to perform a key field lookup in a file using bash or awk?

I'm a bit of a newbie to shell scripting and awk. Could anyone suggest a more efficient and elegant solution than what I'm doing below to perform a key lookup between two files?

Two input files:

File 1 - Contains a single column key field (server-metricname-minute) :

key_column  
server026-AckDelayAverage-00:01:00  
server026-AckDelayMax-00:01:00  
server026-AckSent-00:01:00  
server026-DigEnvValidationLatestTime-00:01:00  
server026-DigEnvValidationTimeAverage-00:01:00

File 2 - Comma separated, containing the key field and a number of other fields:

key_column,host,date,minute,metricname, metric value  
server026-AckDelayAverage-00:01:00,server026,May 24 2016,00:01:00,AckDelayAverage,942  
server026-AckDelayMax-00:01:00,server026,May 24 2016,00:01:00,AckDelayMax,5855  
server026-AckSent-00:01:00,server026,May 24 2016,00:01:00,AckSent,49038  

My logic is :

Loop through file1  
If key found in File2  
    print file1.key , file2.field3 , file2.field6 to file3  
else  
    print file1.key + 'KEY_NOT_FOUND' text to file3  
fi    

So the file3 output should have a row for every record in file1.

The code below seems to work, but could anyone suggest a more efficient and elegant method of achieving this?

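while read -r key
do
    # -F treats the key as a fixed string, not a regular expression
    metric_found=$(grep -F -- "$key" file2)
    if [[ -n $metric_found ]]
    then
        # print key, date (field 3) and metric value (field 6)
        echo "$metric_found" | awk -F "," '{print $1","$3","$6}'
    else
        echo "${key},KEY_NOT_FOUND"
    fi
done < file1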

Example output from existing script based on the sample data :

server026-AckDelayAverage-00:01:00,May 24 2016,942  
server026-AckDelayMax-00:01:00,May 24 2016,5855  
server026-AckSent-00:01:00,May 24 2016,49038  
server026-DigEnvValidationLatestTime-00:01:00,KEY_NOT_FOUND  
server026-DigEnvValidationTimeAverage-00:01:00,KEY_NOT_FOUND  

Thanks.

Try this:

awk 'BEGIN{FS=OFS=","}NR==FNR{a[$1]=1;b[$1]=$3;c[$1]=$6;}NR>FNR{if (a[$1]) print $1,b[$1],c[$1]; else print $1,"KEY_NOT_FOUND";}' file2 file1 > file3

$ cat tst.awk
BEGIN { FS=OFS="," }
NR==FNR { file2[$1] = $3 OFS $6; next }
FNR>1 { print $1, ($1 in file2 ? file2[$1] : "KEY_NOT_FOUND") }

$ awk -f tst.awk file2 file1
server026-AckDelayAverage-00:01:00,May 24 2016,942
server026-AckDelayMax-00:01:00,May 24 2016,5855
server026-AckSent-00:01:00,May 24 2016,49038
server026-DigEnvValidationLatestTime-00:01:00,KEY_NOT_FOUND
server026-DigEnvValidationTimeAverage-00:01:00,KEY_NOT_FOUND
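
For anyone who wants to try the two-pass lookup end to end, here is a minimal self-contained sketch. It recreates the question's sample files in a scratch directory and runs the same `NR==FNR` idiom as above. The names `workdir`, `seen` and `result` are placeholders chosen for this example, and it assumes the real files have no trailing whitespace after the key (otherwise the keys would not match exactly).

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# file1: the keys to look up, header line first, as in the question.
cat > "$workdir/file1" <<'EOF'
key_column
server026-AckDelayAverage-00:01:00
server026-AckDelayMax-00:01:00
server026-AckSent-00:01:00
server026-DigEnvValidationLatestTime-00:01:00
server026-DigEnvValidationTimeAverage-00:01:00
EOF

# file2: comma-separated metric rows, keyed by the first field.
cat > "$workdir/file2" <<'EOF'
key_column,host,date,minute,metricname, metric value
server026-AckDelayAverage-00:01:00,server026,May 24 2016,00:01:00,AckDelayAverage,942
server026-AckDelayMax-00:01:00,server026,May 24 2016,00:01:00,AckDelayMax,5855
server026-AckSent-00:01:00,server026,May 24 2016,00:01:00,AckSent,49038
EOF

# First file on the command line (file2, where NR==FNR): store date and value per key.
# Second file (file1): skip its header, then print the stored fields or KEY_NOT_FOUND.
awk '
BEGIN     { FS = OFS = "," }
NR == FNR { seen[$1] = $3 OFS $6; next }
FNR > 1   { print $1, ($1 in seen ? seen[$1] : "KEY_NOT_FOUND") }
' "$workdir/file2" "$workdir/file1" > "$workdir/file3"

# Keep the output in a variable, show it, then clean up the scratch directory.
result=$(cat "$workdir/file3")
printf '%s\n' "$result"
rm -r "$workdir"
```

Because file2 is read once into an array, each key in file1 costs a single hash lookup, rather than one `grep` scan of file2 per key as in the original loop.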
