I am trying to turn a complicated log file into a single CSV file with a Bash script. So far I have only managed to extract the keywords from the log file. Please help me.
Example of the complicated log file (over 10k lines):
"$date1" "url=$a1&http=$a2&ip=$a3&from=$a4"
"$date2" "url=$b1&http=$b2&from=$a4&sip=$b5"
"$date3" "url=$c1&http=$c2&ip=$c3&UID=$c6&K-Id=c8"
"$date4" "http=$d2&ip=$d3&from=$d4&utm_id=$d7"
I extracted the keywords and saved them to a file like this:
url
http
ip
from
sip
UID
utm_id
Now I need a Bash script that produces a CSV-style file like this:
DATE    URL  HTTP  IP  FROM  SIP  UID  utm_ID  K_id
$date1  a1   a2    a3  a4
$date2  b1   b2        b4    b5
$date3  c1   c2    c3             c6           c8
$date4  d1   d2    d3  d4              d7
Please help me.
Here is a working example written in gawk, tested with the data in your question.
log.awk
/.*=.*/ {    # ignore all lines without URL parameters
    for (i = 5; i < NF; i += 2)
        d[substr($2, 1, 10)][$i]++    # awk strings are 1-indexed, so start at 1
    # if your date format is 2017-02-09T06:15:24.349847Z, change to
    #     d[$2][$i]++
}
END {
    for (i in d) {
        for (j in d[i]) {
            t[j]++    # collect all parameter names
        }
    }
    # print header
    printf "DATE"
    for (p in t) {
        printf "\t\t%s", toupper(p)
    }
    printf "\n"
    # print one row per date: the count for each parameter, or an empty cell
    for (i in d) {
        printf "%s", i
        for (p in t) {
            if (p in d[i]) {
                printf "\t\t%s", d[i][p]
            } else {
                printf "\t\t"
            }
        }
        printf "\n"
    }
}
Save the content above as log.awk, then run it from your bash shell:
$ gawk -F '["&=?]' -f log.awk little-output.log
DATE    HTTP  FROM  UTM_ID  URL  K-ID  UID  IP  SIP
$date1  1     1             1               1
$date2  1     1             1                   1
$date3  1                   1    1     1    1
$date4  1     1     1                       1
The pasted result may not line up well here, but it looks fine in the shell, or you can redirect the output to a file.
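Note that the script above prints how often each parameter occurs per date, not the parameter's value. If the values themselves are wanted, as in the table in the question, something along these lines might serve as a starting point. It is only a sketch under a few assumptions: every line has exactly the two quoted fields shown in the question ("DATE" followed by "key=value&key=value"), the output is tab-separated, and columns appear in the order the keys are first seen. The function name log_values is made up for illustration, and the awk code sticks to POSIX features rather than gawk's arrays of arrays.

```bash
#!/bin/bash
# Sketch: print one row per log line holding the parameter VALUES (not counts).
# Assumes each line looks like: "DATE" "key1=val1&key2=val2..."
# log_values is a hypothetical helper name, not part of the script above.
log_values() {
    awk -F'"' '
    NF >= 4 {
        n++
        date[n] = $2                      # text inside the first pair of quotes
        m = split($4, pairs, "&")         # text inside the second pair of quotes
        for (k = 1; k <= m; k++) {
            eq = index(pairs[k], "=")
            if (eq == 0) continue         # skip tokens that have no "="
            key = substr(pairs[k], 1, eq - 1)
            if (!(key in seen)) {         # remember keys in first-seen order
                seen[key] = 1
                order[++nkeys] = key
            }
            val[n, key] = substr(pairs[k], eq + 1)
        }
    }
    END {
        # header row
        printf "DATE"
        for (k = 1; k <= nkeys; k++) printf "\t%s", toupper(order[k])
        printf "\n"
        # one row per input line; missing parameters become empty cells
        for (r = 1; r <= n; r++) {
            printf "%s", date[r]
            for (k = 1; k <= nkeys; k++) printf "\t%s", val[r, order[k]]
            printf "\n"
        }
    }' "$1"
}
```

After sourcing the function, it could be called as `log_values little-output.log > out.tsv`. Because the whole file is held in memory until the END block, this suits logs of the size mentioned in the question (around 10k lines) better than very large ones.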
Here's something to get you started. You can run it like:
./script_below some_log_file.log
The approach is basically:
for each line:
    initialize a new empty key-value map
    save the date into map
    for key/value pairs after date:
        put key value pair into map
    print the contents of the map
Here's the implementation in Bash:
#!/bin/bash
set -e

readonly input_file="$1"

known_keys=("date" "url" "http" "ip" "from" "sip" "UID" "utm_id" "K-Id")

# Build a format string with one right-aligned, 7-character column per key.
format=""
for key in "${known_keys[@]}"; do
    format="%7s$format"
done
format="$format\n"

printf "$format" "${known_keys[@]}"

while IFS= read -r line; do
    unset attrs
    declare -A attrs

    # Drop the double quotes, then split on whitespace: date first, parameters second.
    vals=(${line//\"/})
    attrs['date']=${vals[0]}

    # Turn "k1=v1&k2=v2" into the word list "k1 v1 k2 v2".
    sub_vals=(${vals[1]//[=&]/ })
    set -- "${sub_vals[@]}"
    while [ $# -ge 2 ]; do
        attrs["$1"]="${2/\$/}"    # strip the "$" from the sample values
        shift 2
    done

    printf "$format" \
        "${attrs['date']}" "${attrs['url']}" "${attrs['http']}" "${attrs['ip']}" \
        "${attrs['from']}" "${attrs['sip']}" "${attrs['UID']}" "${attrs['utm_id']}" "${attrs['K-Id']}"
done < "$input_file"
This prints:
   date    url   http     ip   from    sip    UID utm_id   K-Id
 $date1     a1     a2     a3     a4
 $date2     b1     b2            a4     b5
 $date3     c1     c2     c3                   c6            c8
 $date4            d2     d3     d4                   d7
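The output above is fixed-width rather than comma-separated. If actual CSV is needed, the same per-line parse can be reused with a different print step. Here is a rough sketch, assuming Bash 4 or later (for associative arrays) and values that contain no commas, quotes, or whitespace, since no CSV quoting is done. The function name log_to_csv is made up for illustration, and the key list is the same hard-coded one as in the script.

```bash
#!/bin/bash
# Sketch: same per-line parse, but comma-separated output.
# Assumes values contain no commas, quotes, or whitespace (no CSV quoting).
# log_to_csv is a hypothetical helper name.
log_to_csv() {
    local input_file="$1"
    local known_keys=(url http ip from sip UID utm_id K-Id)
    local line date params pair key val row
    local -a pairs
    local -A attrs

    # Header row: "date" followed by the known keys, joined with commas.
    ( IFS=,; printf 'date,%s\n' "${known_keys[*]}" )

    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue     # skip blank lines
        attrs=()
        line=${line//\"/}              # drop the double quotes
        date=${line%% *}               # text before the first space
        params=${line#* }              # text after the first space
        IFS='&' read -r -a pairs <<< "$params"
        for pair in "${pairs[@]}"; do
            [[ $pair == *=* ]] || continue
            key=${pair%%=*}
            val=${pair#*=}
            attrs[$key]=${val#\$}      # drop a leading "$" from the value
        done
        row=$date
        for key in "${known_keys[@]}"; do
            row+=",${attrs[$key]-}"    # empty cell when the key is missing
        done
        printf '%s\n' "$row"
    done < "$input_file"
}
```

For the four sample lines from the question, `log_to_csv some_log_file.log` should print:

```
date,url,http,ip,from,sip,UID,utm_id,K-Id
$date1,a1,a2,a3,a4,,,,
$date2,b1,b2,,a4,b5,,,
$date3,c1,c2,c3,,,c6,,c8
$date4,,d2,d3,d4,,,d7,
```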
One final note: while I have shown that this can indeed be done in Bash, I would recommend using a full-blown, proper programming language instead.