Awk printing out smallest and highest number, in a time format
I'm fairly new to Linux and the bash shell, and I'm having trouble printing two values (the highest and lowest) from a particular column in a text file. The file is formatted like this:
Geoff Audi 2:22:35.227
Bob Mercedes 1:24:22.338
Derek Jaguar 1:19:77.693
Dave Ferrari 1:08:22.921
As you can see, the final column is a timing. I'm trying to use awk to print the highest and lowest timing in that column. I'm really stumped. I've tried:
awk '{print sort -n < $NF}' timings.txt
However, that didn't seem to sort anything. I just received this output:
1
0
1
0
...
This repeated over and over. The real output went on for longer, but the first few lines show the pattern.
My desired output would be:
Min: 1:08:22.921
Max: 2:22:35.227
After question clarifications: if the time field always has the same number of digits in the same place, e.g. h:mm:ss.ss, the solution can be drastically simplified. Namely, we no longer need to convert the time to seconds to compare it; we can do a simple string (lexicographical) comparison:
$ awk 'NR==1 {m=M=$3} {$3<m&&m=$3; $3>M&&M=$3} END {printf("min: %s\nmax: %s",m,M)}' file
min: 1:08:22.921
max: 2:22:35.227
The logic is the same as in the (previous) script below, just using a simpler string-only comparison for ordering values (determining min/max). We can do that since we know all timings conform to the same format, so if a < b as strings (for example "1:22:33" < "1:23:00") we know a is "smaller" than b. (If values are not consistently formatted, lexicographical comparison alone can't order them correctly, e.g. "12:00:00" < "3:00:00" as strings, even though 12 hours is the longer time.)
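As a quick check of the string comparison described above, here is a small sketch that compares the two example pairs directly. The helper name str_less is made up for this illustration; it prints 1 when its first argument sorts before its second as a string, and 0 otherwise.

```shell
# str_less A B: print 1 if A sorts before B as a string, else 0.
# (str_less is a hypothetical helper name, not part of the original answer.)
str_less() {
    # Concatenating "" forces awk to compare both values as strings.
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { print ((a "") < (b "")) }'
}

str_less 1:22:33 1:23:00     # same layout: string order matches time order
str_less 12:00:00 3:00:00    # mixed widths: "1" sorts before "3", so 12h looks smaller
```

Both calls print 1. The second result is the problem case: it is only a correct ordering when every time string has the same layout.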
So, on the first value read (first record, NR==1), we set the initial min/max value to the timing read (in the 3rd field). For each record we test whether the current value is smaller than the current min, and if it is, we set the new min. Similarly for the max. We use short-circuiting instead of if to make the expressions shorter ($3<m && m=$3 is equivalent to if ($3<m) m=$3). In the END block we simply print the result.
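For comparison, here is the same idea with the short-circuit expressions written out as plain if statements. This is only a sketch: the function name minmax_str is made up for illustration, and the sample lines are copied from the question.

```shell
# minmax_str [FILE...]: print the smallest and largest 3rd field,
# compared as strings. Assumes every time has the same h:mm:ss.sss layout.
# (minmax_str is a hypothetical name used only for this example.)
minmax_str() {
    LC_ALL=C awk '
        NR == 1 { min = max = $3 }      # seed both from the first record
        {
            if ($3 < min) min = $3      # string comparison, not numeric
            if ($3 > max) max = $3
        }
        END { printf "min: %s\nmax: %s\n", min, max }
    ' "$@"
}

# Sample data from the question, fed on standard input.
printf '%s\n' \
    'Geoff Audi 2:22:35.227' \
    'Bob Mercedes 1:24:22.338' \
    'Derek Jaguar 1:19:77.693' \
    'Dave Ferrari 1:08:22.921' | minmax_str
```

With the question's data this prints min: 1:08:22.921 and max: 2:22:35.227, matching the one-liner above.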
Here's a general awk solution that accepts time strings with a variable number of digits for hours/minutes/seconds per record:
$ awk '{split($3,t,":"); s=t[3]+60*(t[2]+60*t[1]); if (s<min||NR==1) {min=s;min_t=$3}; if (s>max||NR==1) {max=s;max_t=$3}} END{print "min:",min_t; print "max:",max_t}' file
min: 1:22:35.227
max: 10:22:35.228
Or, in a more readable form:
#!/usr/bin/awk -f
{
    split($3, t, ":")
    s = t[3] + 60 * (t[2] + 60 * t[1])
    if (s < min || NR == 1) {
        min = s
        min_t = $3
    }
    if (s > max || NR == 1) {
        max = s
        max_t = $3
    }
}
END {
    print "min:", min_t
    print "max:", max_t
}
For each line, we convert the time components (hours, minutes, seconds) from the third field to seconds, which we can then simply compare as numbers. As we iterate, we track the current min and max values, printing them in the END block. Initial values for min and max are taken from the first line (NR==1).
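To make the conversion concrete, here is a sketch that applies the same formula to a single time string. The helper name to_seconds is made up for this illustration. It uses printf with %.3f because awk's default number output format (%.6g) would drop some of the fractional digits.

```shell
# to_seconds H:MM:SS.sss: print the duration as a number of seconds.
# (to_seconds is a hypothetical helper, not part of the original answer.)
to_seconds() {
    LC_ALL=C awk -v t="$1" 'BEGIN {
        split(t, p, ":")                                  # p[1]=hours, p[2]=minutes, p[3]=seconds
        printf "%.3f\n", p[3] + 60 * (p[2] + 60 * p[1])   # same formula as the script
    }'
}

to_seconds 1:08:22.921     # 22.921 + 60 * (8 + 60 * 1)
to_seconds 10:22:35.228    # 35.228 + 60 * (22 + 60 * 10)
```

The two calls print 4102.921 and 37355.228. Compared as numbers, the 10-hour value is correctly the larger one, whereas a string comparison would put "10:22:35.228" before "1:08:22.921".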
Given your statements that the time field is actually a duration and the hours component is always a single digit, this is all you need:
$ awk 'NR==1{min=max=$3} {min=(min<$3?min:$3); max=(max>$3?max:$3)} END{print "Min:", min ORS "Max:", max}' file
Min: 1:08:22.921
Max: 2:22:35.227
You don't want to run sort inside of awk (even with the proper syntax).
Try this:
sed 1d timings.txt | sort -k3,3n | sed -n '1p; $p'
where