Perl + Bash one-liner to sum bytes sent from Apache log files: Can this be right?

A client has requested their outgoing bandwidth usage.

Our Apache logs have lines like the following, where 36618 represents the outgoing request size in bytes:

111.111.111.11 - - - foo.org [23/May/2014:01:00:15 -0400] 0 36618 "GET /baz/bar.html HTTP/1.1" 200 3734 "http://foo.org/baz/bar.html" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36"

I constructed this (mostly) Perl one-liner to sum all of those numbers: it grabs each one with a regex, appends it to an initially empty array, then prints the array one value per line. The output is ultimately piped through |paste -sd+|bc, which joins the values with '+' and hands the expression to bc for evaluation. The log file is fed into the line below on standard input.

@BYTES = (); while(<>) { push(@BYTES, $1) if ( $_ =~ qr/] (?:\d+|-) (\d+)/) }; foreach(@BYTES) { print "$_\n" }

However, I am seeing much higher usage than I would expect, multiple gigabytes in just a few days. That cannot be right. What's wrong here?

UPDATE: See my comment below. I had the wrong field: the one I chose was the time taken to serve the request in microseconds, which is bound to be higher than the size of the response in bytes.

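It might be that you're looking at the wrong field; check the LogFormat directive in your web server configuration. In the sample line, the response size is more likely the number that follows the status code (3734, right after 200):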

111.111.111.11 - - - foo.org [23/May/2014:01:00:15 -0400] 0 36618 "GET /baz/bar.html HTTP/1.1" 200 3734 "http://foo.org/baz/bar.html" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36"
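
A quick way to see which position each value occupies is to print every whitespace-separated token of one log line next to its index. This is only a minimal sketch, assuming awk is available; the function name show_fields is made up for illustration:

```shell
# show_fields: print each whitespace-separated token of the first line
# read from standard input, prefixed with its 1-based position.
show_fields() {
    awk 'NR == 1 {
        for (i = 1; i <= NF; i++) printf "%d\t%s\n", i, $i
        exit    # only the first line is inspected
    }'
}
```

Run as: head -n 1 access_log | show_fields. For the sample line above, 36618 appears at position 9 and 3734 at position 14, which can then be compared against the LogFormat directive.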

This seems to work:

cat file.log | sed 's/"http.*//' | awk '{print $NF}' | paste -sd+ | bc

Perhaps:

sum=$( perl -anE '$sum += $F[8]} END {say $sum' file.log )

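Assumes the size is always in the 9th whitespace-separated field of a line ($F[8]). Note that in the sample line that field is 36618, which the question's update identifies as the time taken rather than the size; to sum the number after the status code instead, the index would be 13, provided the request line always splits into exactly three tokens.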

Use a Perl one-liner to extract from the correct field:

perl -ne '$s += $1 if /"\s+\d+\s+(\d+)\s+"/ }{ print "$s\n"' access_log
