The input is as follows:
A 20 240
A 15 150
B 65 210
B 80 300
C 90 400
C 34 320
For each category (labelled A, B, C, ... in the first column), I'd like to find the minimum of the second column and the maximum of the third column (i.e. the widest range). So I expect to see:
A 15 240
B 65 300
C 34 400
How can I do this using bash?
Using awk:
awk '
!($1 in min) { min[$1] = $2; max[$1] = $3; next }
{
min[$1] = ( $2 < min[$1] ? $2 : min[$1] )
max[$1] = ( $3 > max[$1] ? $3 : max[$1] )
}
END {
for(x in min) print x, min[x], max[x]
}' file
A 15 240
B 65 300
C 34 400
We iterate over each line and store the min and max values in two maps keyed by the first column. In the END block we iterate over the keys and print each key together with its values from both maps.
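One caveat: awk does not define the order in which `for (x in min)` visits the keys, so the result lines may come out in any order. A minimal sketch, assuming output sorted by key is wanted; it wraps the same logic in a shell function (the name `min_max_sorted` is made up for illustration) and pipes the result through `sort`:

```shell
# min_max_sorted is a hypothetical helper name, not part of the original answer.
# It reads the named files (or stdin when no file is given).
min_max_sorted() {
    awk '
    # first time this key is seen: initialise both maps
    !($1 in min) { min[$1] = $2; max[$1] = $3; next }
    {
        if ($2 < min[$1]) min[$1] = $2   # keep the smallest 2nd-column value
        if ($3 > max[$1]) max[$1] = $3   # keep the largest 3rd-column value
    }
    END { for (x in min) print x, min[x], max[x] }
    ' "$@" | LC_ALL=C sort
}
```

It could be called as `min_max_sorted file`, or with the data on standard input.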
I tried another solution, as a workaround for the side effect of unset variables in awk. (It may also be slightly more optimized.)
cat min_max
#!/bin/bash
awk '
NF!=3 || $2 $3 ~ "[^0-9-]" {next;} # skip lines without 3 fields or with non-integer values
min[$1]=="" {min[$1]=$2; max[$1]=$3; next;} # first occurrence of an ID: set min and max, read next line
min[$1]>$2 {min[$1]=$2;} # later occurrences: update min if required
max[$1]<$3 {max[$1]=$3;} # update max if required
END {for(x in min)printf("%-2s %5d %5d\n", x, min[x], max[x]);}
' "$1"
cat in4
A 20 240
B 65 210
C 90 400
A 15 150
C 34 320
E -30 -20
D 0 100
B 80 300
D 10 90
E -20 -10
./min_max in4
A 15 240
B 65 300
C 34 400
D 0 100
E -30 -10
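The side effect in question is that merely referencing an array element in awk creates it, whereas the `in` operator tests membership without creating anything. A small sketch to demonstrate this (the function name `show_side_effect` and the array name `seen` are made up for illustration):

```shell
# show_side_effect is a hypothetical name used only for this demonstration.
show_side_effect() {
    awk 'BEGIN {
        before = ("k" in seen)         # membership test only: does not create seen["k"]
        dummy  = (seen["k"] == "")     # plain reference: creates seen["k"] as an empty element
        after  = ("k" in seen)         # the key now exists
        print before, after
    }'
}
```

Calling `show_side_effect` prints `0 1`. In the min_max script above this is harmless, because the rule that references `min[$1]` assigns to it right away when it is empty.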
This bash script produces the same output.
cat min_max2
#!/bin/bash
(($#!=1))&& { echo "Usage $0 inpfile"; exit 1; }
declare -A min max # define associative arrays
while read -r id mn mx; do
[[ ${min[$id]+any} == "" ]] && { min[$id]=$mn; max[$id]=$mx; continue; } # parameter expansion: empty if the key is unset
(( ${min[$id]} > $mn )) && min[$id]=$mn
(( ${max[$id]} < $mx )) && max[$id]=$mx
done < "$1"
for i in "${!min[@]}"; do printf "%-2s %5d %5d\n" "$i" "${min[$i]}" "${max[$i]}"; done