
Parse single log line with awk

I am trying to find a way to parse a single (Apache) log line into blocks. I know I could change the Apache config to emit JSON instead, but I believe knowing how to do this in awk will help me in the future.

So I have this:

127.0.1.1:80 187.207.66.53 - - [18/Jan/2021:18:28:22 +0100] "GET / HTTP/1.1" 200 2352 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"

And want to change it into this:

127.0.1.1:80
187.207.66.53
-
-
[18/Jan/2021:18:28:22 +0100]
"GET / HTTP/1.1"
200
2352
[...]

So basically I believe I need to set up different field separators, am I right?

awk -F '[<fieldSeparator1>|<fieldSeparator2> ]' '{
    for (i = 1; i <= NF; i++)
        print $i
}' file
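
A separator alone only gets partway, because spaces occur both between fields and inside the bracketed and quoted ones. As a rough illustration of how far one separator goes, the sketch below splits on the double quote instead of the space. The function name quoted_parts is made up for this example, and it assumes the line has exactly the three quoted strings shown in the sample.

```shell
# quoted_parts: print the request line, the status code and the user agent
# of each log line read from the given files (or from standard input).
# Assumption: the line follows the sample above, with three quoted strings.
quoted_parts() {
    awk -F'"' '{
        # With " as the separator, $2 is the request and $6 the user agent.
        # $3 is the unquoted text in between, here " 200 2352 ".
        split($3, nums, " ")
        print $2          # request line, quotes removed
        print nums[1]     # status code
        print $6          # user agent, quotes removed
    }' "$@"
}
```

Called as `quoted_parts file`, this prints three lines per record. It cannot separate the leading space-delimited fields from the bracketed timestamp, which is why a pattern that describes the fields themselves is the better fit.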

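With GNU awk and a regex in FPAT. Tested only with your example: the unquoted fields are matched by [0-9:.]+ or a single -, so a non-numeric value there (a remote user name, for instance) would not be captured correctly.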

awk '{$1=$1; print}' OFS='\n' FPAT='"[^"]*"|\\[[^]]*]|[0-9:.]+|-' file

FPAT: a regular expression describing the contents of the fields in a record. When set, gawk parses the input into fields, where the fields match the regular expression, instead of using the value of FS as the field separator.

Output:

127.0.1.1:80
187.207.66.53
-
-
[18/Jan/2021:18:28:22 +0100]
"GET / HTTP/1.1"
200
2352
"-"
"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"

See: man awk and The Stack Overflow Regular Expressions FAQ
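
FPAT is a gawk extension, so the command above does not split this way in other awk implementations. A minimal sketch of a portable alternative follows. It is not what the answer uses: it emulates FPAT with the standard match() function and the RSTART and RLENGTH variables. The function name split_log_line is invented for this example, and the pattern is slightly broader than the FPAT above, since it accepts any run of non-space characters as an unquoted field.

```shell
# split_log_line: print each field of an Apache log line on its own line.
# Reads the given files, or standard input when called without arguments.
split_log_line() {
    awk '{
        rest = $0
        # A field is a "quoted string", a [bracketed block],
        # or a run of non-space characters. match() returns the
        # leftmost, longest match, so quoted and bracketed fields
        # are taken whole even though they contain spaces.
        while (match(rest, /"[^"]*"|\[[^]]*\]|[^ ]+/)) {
            print substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)   # continue after this field
        }
    }' "$@"
}
```

Run as `split_log_line file`, this should give the same ten lines as the output above for the sample record.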

With GNU awk, using the third argument to match(), which stores the text of each capture group in an array:

$ awk '
    match($0,/(\S+) (\S+) (\S+) (\S+) (\[[^]]*]) ("[^"]*") (\S+) (\S+) ("[^"]*") ("[^"]*")/,f) {
        for (i=1; i in f; i++) {
            print f[i]
        }
    }
' file
127.0.1.1:80
187.207.66.53
-
-
[18/Jan/2021:18:28:22 +0100]
"GET / HTTP/1.1"
200
2352
"-"
"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
