
How to make a bash script build a CSV file from keywords in a log file

I am trying to build a single CSV file from a complicated log file with a bash script. So far I have only managed to extract the keywords from the log file. Please help me.

Example lines from the log file (over 10k lines in total):

"$date1" "url=$a1&http=$a2&ip=$a3&from=$a4"

"$date2" "url=$b1&http=$b2&from=$a4&sip=$b5"

"$date3" "url=$c1&http=$c2&ip=$c3&UID=$c6&K-Id=c8"

"$date4" "http=$d2&ip=$d3&from=$d4&utm_id=$d7"

I found the keywords and saved them to a file like this:

url
http
ip
from
sip
UID
utm_id

Now I need a bash script that turns the log into a CSV-style file like this:

DATE    URL  HTTP  IP   FROM  SIP  UID  utm_ID  K_id
$date1  a1   a2    a3   a4
$date2  b1   b2         b4    b5
$date3  c1   c2    c3              c6           c8
$date4  d1   d2    d3   d4              d7

Please help me.

Here is a working example written in gawk, tested with the data in your question. It uses arrays of arrays, so it needs gawk 4.0 or later.

log.awk

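/=/ {  # skip lines that carry no key=value parameters
    # With -F '["&=?]' and lines shaped like the ones in the question,
    # $2 is the date, keys sit in $4, $6, ... and values in $5, $7, ...
    # (shift the start index if extra fields precede the parameters)
    for (i = 4; i < NF; i += 2)
        d[substr($2, 1, 10)][$i]++   # count each key per date (first 10 characters)
    # if your date format is 2017-02-09T06:15:24.349847Z, change to
    # d[$2][$i]++
}

END {
    # collect every parameter name seen on any line
    for (i in d) {
        for (j in d[i]) {
            t[j]++
        }
    }

    # print header
    printf "DATE"
    for (p in t) {
        printf "\t\t%s", toupper(p)
    }
    printf "\n"

    # print one row per date: the count if the key occurred, an empty cell otherwise
    for (i in d) {
        printf "%s", i
        for (p in t) {
            if (p in d[i]) {
                printf "\t\t%s", d[i][p]
            } else {
                printf "\t\t"
            }
        }
        printf "\n"
    }
}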

Save the content above as log.awk, then run it from your bash shell:

$ gawk -F '["&=?]' -f log.awk little-output.log
DATE    HTTP    FROM    UTM_ID  URL K-ID    UID IP  SIP
$date1  1   1       1           1   
$date2  1   1       1               1
$date3  1           1   1   1   1   
$date4  1   1   1               1   

The pasted result here is not aligned well, but it looks fine in the shell, or you can redirect the output to a file.
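The gawk script above fills each cell with a count of how often a key appears, whereas the table in the question holds the values themselves. A minimal sketch of a value-printing variant follows. It assumes lines shaped exactly like the sample ("date" "k1=v1&k2=v2"), so that the date is field 2 and the keys start at field 4. The function name to_table is hypothetical, and it uses plain awk with comma subscripts instead of gawk's arrays of arrays. Columns appear in first-seen order.

```shell
#!/bin/bash
# Sketch only: to_table is a hypothetical helper, not part of the answer above.
# Assumes each log line looks like: "DATE" "k1=v1&k2=v2&..."
to_table() {
    awk -F '["&=?]' '
    /=/ {
        date = $2
        # remember dates and keys in the order they first appear
        if (!(date in seen_date)) { seen_date[date] = 1; dates[++nd] = date }
        for (i = 4; i < NF; i += 2) {
            if (!($i in seen_key)) { seen_key[$i] = 1; keys[++nk] = $i }
            val[date, $i] = $(i + 1)   # store the value, not a count
        }
    }
    END {
        # header row
        printf "DATE"
        for (k = 1; k <= nk; k++) printf "\t%s", toupper(keys[k])
        printf "\n"
        # one tab-separated row per date; missing keys give empty cells
        for (r = 1; r <= nd; r++) {
            printf "%s", dates[r]
            for (k = 1; k <= nk; k++) printf "\t%s", val[dates[r], keys[k]]
            printf "\n"
        }
    }' "$@"
}
```

It could be run as `to_table some_log_file.log > table.tsv`. Values are copied verbatim, so a leading `$` in the log stays in the output.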

Here's something to get you started. You can run it like:

./script_below some_log_file.log

The approach is basically:

for each line:
    initialize a new empty key-value map
    save the date into map
    for key/value pairs after date:
        put key value pair into map

    print the contents of the map

Here's the implementation in Bash:

#!/bin/bash

set -e

readonly input_file="$1"

# Row format: nine right-aligned columns, 7 characters wide each.
format=""
for i in {1..9}; do
    format+="%7s"
done
format+="\n"

known_keys=("date" "url" "http" "ip" "from" "sip" "UID" "utm_id" "K-Id")
printf "$format" "${known_keys[@]}"

while IFS= read -r line; do
    # start every line with a fresh, empty key-value map
    unset attrs
    declare -A attrs

    # strip the double quotes, then split on whitespace: date first, parameters second
    vals=(${line//\"/})
    attrs['date']=${vals[0]}

    # turn "k1=v1&k2=v2" into the word list "k1 v1 k2 v2"
    sub_vals=(${vals[1]//[=&]/ })

    set -- "${sub_vals[@]}"
    while [ $# -ge 2 ]; do
        attrs["$1"]="${2#\$}"   # drop the leading "$" of the placeholder values
        shift 2
    done

    printf "$format" \
        "${attrs['date']}" "${attrs['url']}" "${attrs['http']}" "${attrs['ip']}" \
        "${attrs['from']}" "${attrs['sip']}" "${attrs['UID']}" "${attrs['utm_id']}" "${attrs['K-Id']}"

done < "$input_file"

This prints:

   date    url   http     ip   from    sip    UID utm_id   K-Id
 $date1     a1     a2     a3     a4                            

 $date2     b1     b2            a4     b5                     

 $date3     c1     c2     c3                   c6            c8

 $date4            d2     d3     d4                   d7       
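Since the question asks for CSV, the same map-per-line approach can also emit comma-separated rows. The following is a sketch, not a drop-in replacement. log_to_csv is a hypothetical helper that takes the wanted keys as arguments and reads the log on standard input. It assumes bash 4 or later (associative arrays), lines shaped like the sample, and values that contain no commas or double quotes, so nothing is escaped.

```shell
#!/bin/bash
# Sketch only: log_to_csv is a hypothetical helper name.
# Assumes each log line looks like: "DATE" "k1=v1&k2=v2&..."
log_to_csv() {
    local keys=("$@") line date query pair key row
    local -a pairs
    local -A attrs

    # header row: "date" followed by the requested keys
    row="date"
    for key in "${keys[@]}"; do row+=",$key"; done
    printf '%s\n' "$row"

    while IFS= read -r line; do
        [[ $line == *=* ]] || continue     # skip lines without key=value pairs
        attrs=()                           # fresh map for every line
        line=${line#\"}; line=${line%\"}   # drop the outermost quotes
        date=${line%%\"*}                  # text before the next quote
        query=${line##*\"}                 # text after the last quote
        IFS='&' read -r -a pairs <<< "$query"
        for pair in "${pairs[@]}"; do
            key=${pair%%=*}
            [[ -n $key && $pair == *=* ]] || continue
            attrs["$key"]=${pair#*=}
        done
        # one row: the date, then each requested key (empty if absent)
        row=$date
        for key in "${keys[@]}"; do row+=",${attrs[$key]}"; done
        printf '%s\n' "$row"
    done
}
```

A possible invocation is `log_to_csv url http ip from sip UID utm_id K-Id < some_log_file.log > out.csv`. Passing the keys as arguments keeps the column order under the caller's control, which a keyword file like the one in the question could supply.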

One final note: while I have shown that this can indeed be done in Bash, I would recommend a full-fledged programming language for this kind of task instead.


 