AWK matching values in a column and performing calculation
I'm new to AWK and I'm trying to figure out an answer to my problem. I have a flat file with the following values:
403 | SanMateo | f | 2015-04-09 18:50:24.38
403 | SanMateo | t | 2015-04-09 18:45:24.36
403 | SanMateo | t | 2015-04-09 18:40:24.383
403 | SanMateo | f | 2015-04-09 18:35:24.357
403 | SanMateo | t | 2015-04-09 18:30:24.355
404 | RedwoodCity| f | 2015-04-09 18:35:50.308
404 | RedwoodCity| t | 2015-04-09 18:30:50.242
404 | RedwoodCity| f | 2015-04-09 18:25:50.245
404 | RedwoodCity| t | 2015-04-09 18:20:50.242
404 | RedwoodCity| f | 2015-04-09 18:15:50.242
I want to use awk to compare $1 of the current line to $1 of the next line, and check that $3 ~ /f/. If the statement is true, subtract $4 of the next line from $4 of the current line and write the difference in a new column of the current line; if it is false, do nothing. What I have so far is this:
awk 'BEGIN {FS="|";} {if (NR $1 ~ NR++ $1 && $3 ~ /f/) subtract = NR $4 - NR++ $4; {print subtract}}' allHealthRecords_Sorted
Obviously that's not working. Can someone please help?
Save this as time_diff.awk:
BEGIN {FS = "[[:blank:]]*\\|[[:blank:]]*"}
# convert "YYYY-mm-dd HH:MM:SS.fff" to a number
function to_time(timestamp,    fraction) {
    fraction = timestamp
    sub(/\..*$/, "", timestamp)
    gsub(/[-:]/, " ", timestamp)
    # a timestamp without a fractional part contributes 0
    if (!sub(/.*\./, "0.", fraction))
        fraction = 0
    return mktime(timestamp) + fraction
}
# gawk has no builtin abs() function
function abs(val) {
    return( val < 0 ? -1*val : val)
}
# add the time diff if the condition is met
NR > 1 {
    diff = 0
    if ($1+0 == key && flag == "f")
        diff = abs( to_time($4) - to_time(time) )
    print line (diff > 0 ? " | " diff : "")
}
{
    # remember the previous line's values
    key = $1+0; flag = $3; time = $4; line = $0
}
# the last line has no successor, so print it unchanged
END {print}
Then:
$ gawk -f time_diff.awk file
403 | SanMateo| f | 2015-04-09 18:50:24.38 | 300.02
403 | SanMateo| t | 2015-04-09 18:45:24.36
403 | SanMateo| t | 2015-04-09 18:40:24.383
403 | SanMateo| f | 2015-04-09 18:35:24.357 | 300.002
403 | SanMateo| t | 2015-04-09 18:30:24.355
404 | RedwoodCity| f | 2015-04-09 18:35:50.308 | 300.066
404 | RedwoodCity| t | 2015-04-09 18:30:50.242
404 | RedwoodCity| f | 2015-04-09 18:25:50.245 | 300.003
404 | RedwoodCity| t | 2015-04-09 18:20:50.242
404 | RedwoodCity| f | 2015-04-09 18:15:50.242
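Note that mktime() is a GNU awk extension and interprets the timestamp in the local time zone. If gawk is not available, the subtraction can be done by hand. The following is only a sketch: ts_diff and days() are hypothetical names, not part of the script above, and the timestamps are treated as naive values with no time zone or daylight-saving adjustments. Days and seconds-of-day are subtracted separately to keep the fractional part precise.

```shell
#!/bin/sh
# Hypothetical helper: print t1 - t2 in seconds for two
# "YYYY-mm-dd HH:MM:SS.fff" timestamps, using only POSIX awk features.
ts_diff() {
    awk -v t1="$1" -v t2="$2" '
    # Day count for a Gregorian date; only differences between two
    # results are meaningful. January and February are treated as
    # months 13 and 14 of the preceding year.
    function days(y, m, d) {
        if (m <= 2) { y--; m += 12 }
        return 365*y + int(y/4) - int(y/100) + int(y/400) \
               + int((153*(m - 3) + 2) / 5) + d
    }
    BEGIN {
        # a[1..3] = year, month, day; a[4..6] = hours, minutes, seconds
        split(t1, a, /[- :]/)
        split(t2, b, /[- :]/)
        whole_days = days(a[1]+0, a[2]+0, a[3]+0) - days(b[1]+0, b[2]+0, b[3]+0)
        day_secs   = (a[4]*3600 + a[5]*60 + a[6]) - (b[4]*3600 + b[5]*60 + b[6])
        print whole_days * 86400 + day_secs
    }'
}
```

For example, ts_diff "2015-04-09 18:50:24.38" "2015-04-09 18:45:24.36" prints 300.02, and the function also copes with a pair of timestamps that straddles midnight.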
You don't show your expected output so we can't test it, and $4 is a date so I don't know what you mean by "subtract", but this is basically the right approach:
$ cat tst.awk
BEGIN{ FS="[[:space:]]*[|][[:space:]]*"; OFS=" | " }
split(prev,p) { print prev ( ($1==p[1])&&(p[3]=="f") ? OFS p[4] - $4 : "") }
{ prev = $0 }
END { print prev }   # the last line has no next line, so nothing is appended
$ awk -f tst.awk file
403 | SanMateo | f | 2015-04-09 18:50:24.38 | 0
403 | SanMateo | t | 2015-04-09 18:45:24.36
403 | SanMateo | t | 2015-04-09 18:40:24.383
403 | SanMateo | f | 2015-04-09 18:35:24.357 | 0
403 | SanMateo | t | 2015-04-09 18:30:24.355
404 | RedwoodCity| f | 2015-04-09 18:35:50.308 | 0
404 | RedwoodCity| t | 2015-04-09 18:30:50.242
404 | RedwoodCity| f | 2015-04-09 18:25:50.245 | 0
404 | RedwoodCity| t | 2015-04-09 18:20:50.242
404 | RedwoodCity| f | 2015-04-09 18:15:50.242
i.e. you keep a buffer of one line, so you're always operating on and outputting the previous line that you read.
In the BEGIN action, read the first line with getline and save the values of $1 and $4.
On each line thereafter, compare $1 to the saved value from the previous line. If they are the same, and $3 ~ /f/, do the desired process. Then save the values of $1 and $4 for the next line.
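As a rough sketch of those steps (not necessarily what was intended here), the version below wraps the awk program in a hypothetical shell function, time_diff. It also saves $3 and the whole previous line, which the steps do not mention, because the difference is appended to the previous line when that line's flag is f. The secs() helper is an assumption as well: it looks only at the time of day, so it assumes both timestamps fall on the same date.

```shell
#!/bin/sh
# Hypothetical helper: append the time difference to each "f" line whose
# successor has the same key in column 1.
time_diff() {
    awk '
    # Seconds within the day for "YYYY-mm-dd HH:MM:SS.fff".
    # The date part is ignored (assumption: same date on both lines).
    function secs(ts,    a) {
        split(ts, a, /[ :]/)
        return a[2]*3600 + a[3]*60 + a[4]
    }
    # Save the fields of the current record for use on the next one.
    function remember() {
        key = $1; flag = $3; time = $4; line = $0
    }
    BEGIN {
        FS = " *[|] *"
        if ((getline) <= 0)      # empty input: nothing to do
            exit
        remember()
        have = 1
    }
    {
        # Same key as the saved line, and the saved line is flagged f.
        if ($1 == key && flag ~ /f/)
            line = line " | " (secs(time) - secs($4))
        print line
        remember()
    }
    # The final line has no successor, so it is printed unchanged.
    END { if (have) print line }
    ' "$@"
}
```

Calling time_diff allHealthRecords_Sorted then prints every input line, with the difference in seconds added as a fifth column where the condition holds.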
That should be enough to get you started. If you have trouble writing the code, come back and ask more questions.