awk split string with no delimiter

I have the following result/stat file from running tests that I want to analyze using awk:

$date     $time    $statname $traffic_rate $val1 $val2
20140909 132920326 stat1     30/sec        40    80
20140909 132950326 stat1     29/sec        60    20
20140909 133020326 stat1     28/sec        70    100
20140909 133050326 stat1     0/sec          0    0
20140909 133120326 stat1     0/sec          0    0
20140909 133150326 stat1     30/sec        90    50

The $time is in the format HHMMSSmmm, and stats are generated at 30-second intervals. I need to average the $val1 and $val2 values over each run of consecutive stats whose $traffic_rate is >= 28/sec. Stats with a traffic_rate < 28/sec should be ignored, and the process repeated for the next run >= 28/sec, and so on.

I want to use a bash script and thought awk would be a good choice for analyzing column data. To check that the timestamps of stats with $traffic_rate >= 28/sec are consecutive, I need to convert $time with mktime. However, I cannot split $time since it has no delimiter. Is there a way to split by character count, like in PHP?

The sample output will be as follows:

test# $val   $val2
1      170/3 200/3
2      90/1  50/1

That is, each run of consecutive stats >= 28/sec is a single test result and should be computed separately.

Also, any other recommendations for analyzing this type of pattern would be appreciated. Thanks.
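On the literal question of splitting a string that has no delimiter: awk's substr(string, start, length) picks out characters by position, which is roughly what a fixed-width split does. The following is only a minimal sketch; split_hms is a hypothetical helper name, and it computes seconds since midnight instead of calling mktime, which is a gawk extension and may not exist in other awk implementations. With gawk, the same pieces could be joined into the "YYYY MM DD HH MM SS" string that mktime expects.

```shell
#!/bin/bash

# split_hms is a hypothetical helper (not from the original post).
# It splits a fixed-width HHMMSSmmm string by character position.
split_hms() {
    awk -v t="$1" 'BEGIN {
        hh = substr(t, 1, 2)    # characters 1-2: hours
        mm = substr(t, 3, 2)    # characters 3-4: minutes
        ss = substr(t, 5, 2)    # characters 5-6: seconds
        ms = substr(t, 7, 3)    # characters 7-9: milliseconds
        # Seconds since midnight: enough to compare timestamps on the same day.
        secs = hh * 3600 + mm * 60 + ss
        print hh, mm, ss, ms, secs
    }'
}

split_hms 132920326
```

For the first timestamp in the sample data this prints 13 29 20 326 48560, so two stats are 30 seconds apart when their last values differ by 30.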

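Using awk:

awk -v OFS="\t" '
BEGIN { print "test#", "$val", "$val2" }
# A row below the threshold ends the current run: print the averages and reset.
NR > 1 && $4+0 < 28 && count {
    print ++id, val1/count, val2/count
    count = val1 = val2 = 0
}
# A row at or above the threshold extends the current run (NR > 1 skips the header).
NR > 1 && $4+0 >= 28 {
    val1 += $5
    val2 += $6
    ++count
}
# Print the last run only if the file ended inside one (avoids dividing by zero).
END {
    if (count) print ++id, val1/count, val2/count
}' file

Output:

test#   $val      $val2
1       56.6667  66.6667
2       90       50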
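The awk approach above groups rows by rate only; it does not look at $time. If a run should also be broken when a stat is missing (a gap other than 30 seconds), the timestamp can be split by position with substr and compared with the previous row. This is a sketch under assumptions: avg_runs, secs and emit are hypothetical names, the gap is required to be exactly 30 seconds at whole-second resolution, and a change of $date always starts a new run, so a run crossing midnight is split.

```shell
#!/bin/bash

# avg_runs is a hypothetical helper; the 28/sec threshold and the
# 30-second interval are taken from the question.
avg_runs() {
    awk -v OFS="\t" '
    # Seconds since midnight from an HHMMSSmmm string, split by position.
    function secs(t) {
        return substr(t, 1, 2) * 3600 + substr(t, 3, 2) * 60 + substr(t, 5, 2)
    }
    # Print the averages of the current run, if any, then reset the sums.
    function emit() {
        if (count) print ++id, val1/count, val2/count
        count = val1 = val2 = 0
    }
    NR == 1 { print "test#", "$val1", "$val2"; next }   # skip the header row
    {
        now = secs($2)
        # A low rate, a date change, or a gap other than 30 s ends the run.
        if ($4+0 < 28 || (count && ($1 != day || now - prev != 30))) emit()
        if ($4+0 >= 28) { val1 += $5; val2 += $6; ++count }
        day = $1
        prev = now
    }
    END { emit() }' "$1"
}
```

Called as avg_runs file on the sample data, this gives the same two rows as before (56.6667 and 66.6667, then 90 and 50), because every qualifying stat there is exactly 30 seconds after the previous one.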

You can also accomplish this in plain bash with a short script that averages val1 and val2 over each run of consecutive qualifying rows:

#!/bin/bash

## validate data file input
[ -f "$1" ] || {
    printf "\nError: insufficient input. File '%s' not found.\n\n" "$1"
    exit 1
}

declare -i cnt=0                    # simple count variable

printf "\n    val1    val2\n\n"     # print generic header

## read each line in file
while read -r dt tm sn trf v1 v2 || [ -n "$dt" ]; do

    trf=${trf%/*}               # extract numeric traffic_rate

    # a run continues while the rate is a number >= 28 (the header row is not numeric)
    if [[ $trf =~ ^[0-9]+$ ]] && [ "$trf" -ge 28 ]; then
        v1a+=( $v1 )            # add values to v1 array and v2 array
        v2a+=( $v2 )
        ((cnt++))
    else
        v1s=0                   # reset v1 sum and v2 sum
        v2s=0
        for i in ${v1a[@]}; do v1s=$((v1s+i)); done # calculate v1 sum from v1 array
        for i in ${v2a[@]}; do v2s=$((v2s+i)); done # calculate v2 sum from v2 array
        if [ "$cnt" -gt 0 ]; then                   # if the run that just ended has rows, output
            printf "  %6s  %6s\n" \
            $( echo "scale=2; $v1s/$cnt" | bc ) $( echo "scale=2; $v2s/$cnt" | bc )
        fi
        cnt=0
        unset v1a v2a
    fi

done <"$1"

## output if array elements remain
if [ ${#v1a[@]} -gt 0 ]; then
    v1s=0
    v2s=0
    for i in ${v1a[@]}; do v1s=$((v1s+i)); done
    for i in ${v2a[@]}; do v2s=$((v2s+i)); done
    if [ "$cnt" -gt 0 ]; then
        printf "  %6s  %6s\n" \
        $( echo "scale=2; $v1s/$cnt" | bc ) $( echo "scale=2; $v2s/$cnt" | bc )
    fi
    cnt=0
    unset v1a v2a
fi

printf "\n"

exit 0

output:

$ bash avg30.sh dat/split.dat

    val1    val2

    56.66   66.66
    90.00   50.00
